
Re: [GNUnet-developers] Proposal: Make GNUnet Great Again?


From: Schanzenbach, Martin
Subject: Re: [GNUnet-developers] Proposal: Make GNUnet Great Again?
Date: Sun, 10 Feb 2019 10:50:03 +0100


> On 10. Feb 2019, at 10:36, Florian Dold <address@hidden> wrote:
> 
> On 2/10/19 1:55 PM, Schanzenbach, Martin wrote:
> 
>>> An example for such
>>> tooling would be Google's Repo tool
>>> (https://source.android.com/setup/develop /
>>> https://source.android.com/setup/develop/repo).
>> 
>> 
>> Actually, Google is an example of a proponent of monorepos. So your point
>> is moot here.
>> They need all this tooling _because_ they use a single repo.
> 
> Not quite.  While most of Google's code is in a monorepo (that doesn't
> use git), Android is split over multiple git repos.
> 
> Also they need all this tooling because they have many projects in one
> monorepo.  What I'm saying here is that we shouldn't split up one
> *project* too much.  Then we don't need that extra tooling!
> 
>> Did you even read my last comment? Do you really consider all of the 
>> applications as one "GNUnet" that every
>> user (and developer!) actually cares about?
>> I can tell you the number of times I used / developed something for fs / 
>> social: 0.
>> And, of course, smaller repos make CI faster. They will result in smaller
>> builds (and, more importantly, builds which only build the things that have
>> actually changed).
>> And please, no arguments for stateful builds/runners. I hope we can at least
>> agree that tests and builds should be done in clean environments every time.
>> Otherwise you will not catch a lot of the things that can go wrong (I
>> experienced this myself when I set up my Docker builds for GNUnet, which,
>> unlike BB, actually build from scratch).
> 
> Is tracking dependencies between repos really that much easier than
> tracking dependencies between subdirectories?
> 
> You are (at least I think) making the assumption that high-level GNUnet
> repos would depend on stable versions of GNUnet?
> 
> Otherwise, the cost for CI would be the same.  If I make a change in
> GNUnet-base, of course I'll also have to run tests for everything that
> depends on it.  Are you suggesting to just not do integration testing
> anymore?

I don't see why this is so hard to understand.
Of course we still do testing. A successful build of the code _plus_ a
successful test of the core should trigger the builds+tests of the dependent
apps. The gist is (see the sketch below):

1. It only triggers if the build+test of the core actually passes.
2. The triggered component (e.g. reclaim) does not care about failing
builds/tests in fs. It will still work without them.
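
A minimal sketch of that trigger logic (the repo names and the ci/*.sh helper
scripts are made up for illustration; this is not our actual CI config):

  import subprocess
  import sys

  # Hypothetical repo layout: "gnunet-core" is the base, the others depend on it.
  CORE = "gnunet-core"
  DEPENDENT_APPS = ["gnunet-reclaim", "gnunet-fs", "gnunet-secushare"]

  def build_and_test(repo):
      """Build and test one repo in a clean environment; True on success."""
      if subprocess.run(["./ci/build.sh", repo]).returncode != 0:
          return False
      return subprocess.run(["./ci/test.sh", repo]).returncode == 0

  # 1. The core must build *and* pass its tests before anything else runs.
  if not build_and_test(CORE):
      sys.exit("core build/test failed; dependent apps are not triggered")

  # 2. Each dependent app is built/tested independently; a failure in fs
  #    does not block reclaim, and vice versa.
  for app in DEPENDENT_APPS:
      ok = build_and_test(app)
      print(app, "OK" if ok else "FAILED (does not affect the others)")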

> 
>> How would that happen? Can you give a _concrete_ example in 
>> fs/social/reclaim where this is true?
>> That is exactly the point: it is completely unclear which components a
>> configuration switch affects.
>> If we separate this, we might get _some_ overhead in the configure.ac files,
>> but from my experience I expect this to be very little.
>> And our configure.ac is not something I would consider "developer friendly",
>> given its sheer size and complexity.
> 
> Let's say I want to enable verbose logging.  Then I need to re-configure
> and re-compile all of the repos I depend on, in the right order.  That
> sucks!
> 
> (Re-linking doesn't suffice, as logging is done via CPP macros.)

Not really. If I build reclaim, I will probably use a Docker image, say
gnunet:latest for my debug builds and gnunet:<version> for my release builds,
which have logging enabled and disabled respectively. This is pretty easy to
ensure (see the sketch below).
As I said, you are thinking too much in terms of "every component in its own repo".
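
Roughly like this (a sketch only; the image tags and the BASE_IMAGE build
argument consumed by a downstream Dockerfile are hypothetical):

  import subprocess

  def build_reclaim(debug: bool):
      # Pick the GNUnet base image that already has the logging configuration
      # I want, instead of re-configuring and re-compiling the core myself.
      base = "gnunet:latest" if debug else "gnunet:0.11.0"  # version tag is just an example
      tag = "reclaim:dev" if debug else "reclaim:release"
      subprocess.run(
          ["docker", "build",
           "--build-arg", "BASE_IMAGE=" + base,  # used by "FROM ${BASE_IMAGE}" in the Dockerfile
           "-t", tag, "."],
          check=True)

  build_reclaim(debug=True)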

> 
>> And think about it this way: If a new developer decides to write a new 
>> service / application on top of GNUnet, what will he be faced with?
>> Imagine this app needs only GNS and maybe CADET.
>> The dev will need to integrate and understand the full build and test setup
>> in order to properly set up this project.
>> If we had a few separate examples of how this can be done in a separate 
>> repo, this would go a long way.
> 
> I don't see any issues with "incubating" new GNUnet
> applications/libraries in their own repo.  Maybe some of them can then
> be later integrated with the main gnunet.git.

At this point, there is no benefit in integrating it back.

> 
> (I just saw that in your latest email, you're suggesting exactly this
> for reclaimid.  That is fine, but of course it now makes it hard for
> other GNUnet devs to see locally whether they are breaking your code.
> Well, at least the CI should eventually complain.)

That is also the point. They should not care. Do you really think Gtk+ devs 
care if they break API/ABI and gnunet-gtk fails to build?

> 
> - Florian
> 
>>> On 2/10/19 1:02 AM, Amirouche Boubekki wrote:
>>>> I think splitting the codebase will be a pain for gnunet.
>>>> 
>>>> The only *good* reasons for manyrepos are social or ego politics ("this
>>>> is my lawn") or legal ones. The only one that applies to gnunet is the
>>>> legal one, because one needs to fill out a GNU form to be able to contribute.
>>>> 
>>>> I am biased toward monorepos by my experience dealing with big projects
>>>> (100k+ SLOC), and the only time it made sense to split a project into
>>>> many repositories was because of different teams / workflows (social) and
>>>> different legal terms for the various services/daemons; at a previous
>>>> $WORK, they had to fork Gentoo to make it work.
>>>> 
>>>> Otherwise, each time I saw another repository it was a source of pain:
>>>> 
>>>> - Need to manage several versions
>>>> - The git submodule workflow is not good enough: it doesn't track
>>>> branches, I personally never remember how to find the branch of a commit,
>>>> plus it requires some more git-fu to bump a submodule.
>>>> - Refactoring, anyone?
>>>> - Generally speaking, manyrepos at a small scale are more work
>>>> 
>>>> And again, it somehow requires tracking down every version (what works
>>>> with what), and you end up with another repository (or distribution) with
>>>> another build system that puts everything together. Continuous
>>>> Integration can do that? Where is the code of the CI? Another repo? More
>>>> versions, more git clones, more grepping across repositories / directories
>>>> that are not even in sync.
>>>> 
>>>> Popularity arguments:
>>>> 
>>>> a) OK, everybody knows the GAFAM love monorepos and that this is also a
>>>> source of pain (dedicated teams and software). That said, gnunet is not
>>>> the size of any GAFAM, hence it will not suffer from the monorepo pain points.
>>>> 
>>>> b) GitHub and JavaScript made manyrepos popular for various ego
>>>> reasons and because JavaScript is not good. I won't take inspiration
>>>> from that part of the JavaScript noosphere. gnunet-leftpad, anyone?
>>>> 
>>>> c) Now, there is GNOME. GNOME is famous for its bazaar model of
>>>> development and also famous for the adoption of meson (maybe even its
>>>> inception), or of its previous incarnation jhbuild. Anyway, even if the
>>>> success of GNOME and GNU (which is also a bazaar) is appealing, gnunet is
>>>> not GNU or GNOME. From my point of view the bazaar development model scales
>>>> better / more easily in a socially distributed setting. Also, why is Linux
>>>> still a single repository?
>>>> 
>>>> On Sat, 9 Feb 2019 at 18:16, Schanzenbach, Martin
>>>> <address@hidden <mailto:address@hidden>> wrote:
>>>> 
>>>> 
>>>> 
>>>>> On 9. Feb 2019, at 17:13, Christian Grothoff <address@hidden
>>>>   <mailto:address@hidden>> wrote:
>>>>> 
>>>>> On 2/9/19 5:04 PM, Schanzenbach, Martin wrote:
>>>>>> I have some inline comments as well below, but let us bring this
>>>>   discussion down to a more practical consensus maybe.
>>>>>> I think we are arguing too much in the extremes and that is not
>>>>   helpful. I am not saying we should compartmentalise
>>>>>> GNUnet into the tiniest possible components.
>>>>>> It's just that I think it is becoming a bit bloated.
>>>>>> 
>>>>>> That being said, _most_ of what is in GNUnet today is perfectly
>>>>   fine in a single repo and package.
>>>>>> For now, at least let us not add another one (gtk) as well?
>>>>>> 
>>>>>> Then, we remain with
>>>>>> 
>>>>>> - reclaim (+the things reclaim needs wrt libraries)
>>>>>> - conversation (+X)
>>>>>> - secureshare (+X)
>>>>>> - fs (+X)
>>>>>> 
>>>>>> as components/services on my personal "list".
>>>>>> I suggest that _if_ I find the time, I could extract reclaim into
>>>>   a separate repo as soon as we have a CI and I can
>>>>>> test how it works and we can learn from the experience.
>>>>>> Then, we can discuss if we want to do the same with other
>>>>   components, one at a time, if there is consensus and a person that
>>>>>> would be willing to take ownership (I am pretty sure we talked
>>>>   about this concept last summer as well).
>>>>> 
>>>>> Maybe you could start with extracting the SecuShare components? That
>>>>> should do for a first "experience", and be a bit more effective at
>>>>> reducing bloat as well ;-).
>>>> 
>>>>   Well, I could, but our secushare people are quite active so maybe
>>>>   there are volunteers (if they agree with the proposal at all).
>>>>   Regarding "bloat": if we want to effectively eliminate bloat, then
>>>>   let's look at the numbers:
>>>> 
>>>>   File Sharing:
>>>>   src/fs: 36918 (!) LOC in .c files
>>>>   src/datastore/cache: ~15k LOC in .c files
>>>> 
>>>>   Conversation:
>>>>   src/conversation: 10538 LOC in .c files
>>>> 
>>>>   SecuShare:
>>>>   src/psyc* : ~17000 LOC in .c files (although I am not sure about this,
>>>>   because theoretically psyc is a general-purpose protocol, no?)
>>>>   src/social: 9447 LOC in .c files
>>>>   src/multicast: 5633 LOC in .c files
>>>> 
>>>>   Reclaim:
>>>>   src/reclaim* : ~6500 LOC in .c files
>>>> 
>>>>   Now, considering that fs is practically always built for everybody
>>>>   while SecuShare and reclaim are experimental, it hurts the most for
>>>>   devs that actually compile from source.
>>>>   Everything combined is 110000+ LOC, which is 22% of the codebase
>>>>   (~500k, oO). Considering that there is significant redundancy in
>>>>   transport/ (75k) at the moment, this number is probably closer to 25%.
>>>>   Granted, this is a lot less than I expected ;), but maybe it
>>>>   illustrates the dimensions.
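>>>>
>>>>   (Not part of the original counts, just a quick sketch of how such LOC
>>>>   numbers can be reproduced; the directory prefixes are the ones above:)
>>>>
>>>>     import pathlib
>>>>
>>>>     # Rough LOC count: physical lines in .c files per src/ subtree.
>>>>     SRC = pathlib.Path("src")
>>>>     for prefix in ["fs", "datastore", "cache", "conversation",
>>>>                    "psyc", "social", "multicast", "reclaim"]:
>>>>         loc = sum(len(f.read_text(errors="replace").splitlines())
>>>>                   for d in SRC.glob(prefix + "*") if d.is_dir()
>>>>                   for f in d.glob("*.c"))
>>>>         print(prefix, loc)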
>>>> 
>>>> 
>>>>> 
>>>>> That said, splitting of reclaim seems also much less problematic than
>>>>> fs/conversation, and if you then integrate reclaim with the libgabe
>>>>> tree, the overall number of downloads/installation for reclaim
>>>>   wouldn't
>>>>> go up, so that would certainly kill my argument of making the
>>>>> installation more complex (might indeed simplify it, as one
>>>>   doesn't have
>>>>> to remember to install libgabe before GNUnet to get reclaim).
>>>> 
>>>>   Could do, but libgabe has some nasty additional deps (libpbc and
>>>>   gmp) which we _might_ eventually get rid of completely by
>>>>   implementing GNS-based encryption.
>>>> 
>>>>
>>> 
>> 
> 


