Re: [GSoC 2017] Point-to-point
From: Joan Lledó
Subject: Re: [GSoC 2017] Point-to-point
Date: Sun, 6 Aug 2017 19:25:25 +0200
2017-08-06 19:03 GMT+02:00 Justus Winter <justus@gnupg.org>:
> Joan Lledó <joanlluislledo@gmail.com> writes:
>
>> The last item in my initial TODO list was implementing the --peer
>> option available in pfinet. This week I've been working on it and it's
>> now done, so I can say the LwIP translator is now able to replace
>> pfinet.
>
> And indeed, I'm using lwip on my development box right now:
>
> root@hurdbox ~ # fsysopts /servers/socket/2
> /hurd/lwip --interface=/dev/eth0m/0 --address=192.168.122.246
> --netmask=255.255.255.0 --gateway=192.168.122.1
> --address6=FE80::5054:28FF:FE44:31B6/64
>
> This is excellent work, thank you very much. So what remains to be done
> until we can merge this is to address the scalability issues. I'll
> paste the discussion from
> http://richtlijn.be/~larstiq/hurd/hurd-2017-07-25
> here so that we can discuss this further:
>
> 11:59:43< teythoon> i see two issues with the lwip codebase, and we need to
> talk about how we address them in a way compatible with the goals of the
> upstream project
> 12:01:11< teythoon> jlledom: issue 1/ are the static arrays for sockets etc
> 12:01:21< teythoon> we need to be able to dynamically allocate such objects
> 12:01:25< jlledom> OK
> 12:01:37< teythoon> but we need to introduce it so that it is acceptable
> upstream
> 12:01:47< teythoon> so it must not degrade the performance on embedded systems
> 12:02:10< teythoon> maybe we can introduce an abstraction with two backends
> 12:02:13< teythoon> of some kind
> 12:02:27< teythoon> macros or functions that can be inlined
> 12:02:47< teythoon> but it should not add any overhead
> 12:02:54< teythoon> not in code size, not in memory
> 12:03:23< jlledom> I'm gonna eat, bbl
> 12:03:40< teythoon> you seem to be working good with the lwip community, i'd
> suggest to raise the issue somewhere, maybe the list
> 12:03:42< teythoon> ok
> 12:03:45< teythoon> i'll keep writing
> 12:04:31< teythoon> point 2/ is performance
> 12:04:35< teythoon> or related to performance
> 12:04:50< teythoon> it is not that important though, but worth keeping an eye
> on
> 12:05:10< teythoon> i did some experiments on tcp throughput using iperf
> 12:06:04< teythoon> for me, pfinet was doing double the throughput lwip did
> 12:06:22< teythoon> now that may also be due to how i compiled lwip
> 12:06:34< teythoon> i haven't done better tests
> 12:06:37< teythoon> sorry
> 12:06:50< teythoon> but i am worried about the one global lock that lwip uses
> 12:07:08< teythoon> that on embedded systems is implemented by disabling
> interrupts
> 12:07:30< teythoon> which is very fast and on singleprocessor systems very
> cheap
> 12:08:18< teythoon> but on posix platforms (or in the code in your repo) lwip
> is replacing that with one badly implemented recursive lock on top of a
> pthread mutex
> 12:08:30< teythoon> and that is a very heavyweight primitive in comparison
> 12:09:16< teythoon> there, it is possible with some work to introduce a
> better abstraction
> 12:12:32< teythoon> currently, there are two macros, SYS_ARCH_PROTECT(lev)
> and SYS_ARCH_UNPROTECT(lev)
> 12:12:58< teythoon> now, lev is a local variable that stores processor flags
> 12:13:14< teythoon> this is how the disabling of interrupts is made recursive
> 12:14:05< teythoon> the old state is saved in lev when locking, the
> interrupts are disabled, and the saved state restored on unlocking
> 12:15:15< teythoon> i'd suggest to replace this with per-resource locks on
> posix systems
> 12:15:48< teythoon> for that, a per-resource variable must be defined (in
> addition to the per-critical section local variables)
> 12:16:13< teythoon> then, both must be passed to SYS_ARCH_PROTECT, e.g.
> SYS_ARCH_PROTECT(lock,lev)
> 12:16:34< teythoon> now, on embedded system we can leverage 'lev' to
> implement the former scheme
> 12:16:47< teythoon> and on posix systems, we can just use
> pthread_mutex_lock(lock)
> 12:18:09< teythoon> using posix mutexes with attribute PTHREAD_MUTEX_RECURSIVE
> 12:19:53< teythoon> (big picture: i'm worried about contention on that lock)
> 12:20:24< teythoon> so we need to introduce a more fine-grained locking system
> 12:20:32< teythoon> again without imposing any overhead on small systems
> 12:20:47< teythoon> this should be done using macros by extending
> SYS_ARCH_PROTECT and friends
>
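For reference, here is a minimal sketch of the two-argument
SYS_ARCH_PROTECT(lock, lev) idea from the log above. Every SKETCH_* and
sketch_* name is hypothetical and not part of lwIP; only the pthread
calls are real. The embedded branch assumes the port supplies interrupt
save/restore hooks, and it is not compiled unless SKETCH_NO_THREADS is
defined.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* exposes PTHREAD_MUTEX_RECURSIVE */
#endif
#include <assert.h>

#ifdef SKETCH_NO_THREADS
/* Embedded backend (assumption): no lock object exists at all, and 'lev'
   holds the saved interrupt state. The two hooks are hypothetical. */
typedef unsigned long sketch_prot_t;
sketch_prot_t sketch_irq_save(void);
void sketch_irq_restore(sketch_prot_t saved);
#define SKETCH_LOCK_FIELD(name)                /* expands to nothing */
#define SKETCH_LOCK_INIT(lock)      0
#define SKETCH_PROTECT(lock, lev)   ((lev) = sketch_irq_save())
#define SKETCH_UNPROTECT(lock, lev) sketch_irq_restore(lev)
#else
/* POSIX backend: one recursive mutex per resource; 'lev' only keeps the
   return code of pthread_mutex_lock(). */
#include <pthread.h>
typedef int sketch_prot_t;
#define SKETCH_LOCK_FIELD(name)     pthread_mutex_t name;
#define SKETCH_LOCK_INIT(lock)      sketch_lock_init(lock)
#define SKETCH_PROTECT(lock, lev)   ((lev) = pthread_mutex_lock(lock))
#define SKETCH_UNPROTECT(lock, lev) \
    ((void)(lev), (void)pthread_mutex_unlock(lock))

int sketch_lock_init(pthread_mutex_t *lock)
{
    pthread_mutexattr_t attr;
    int err = pthread_mutexattr_init(&attr);
    if (err != 0)
        return err;
    err = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (err == 0)
        err = pthread_mutex_init(lock, &attr);
    pthread_mutexattr_destroy(&attr);
    return err;
}
#endif

/* Example resource carrying its own lock (zero bytes on embedded). */
struct sketch_counter {
    SKETCH_LOCK_FIELD(lock)
    int value;
};

int sketch_counter_init(struct sketch_counter *c)
{
    c->value = 0;
    return SKETCH_LOCK_INIT(&c->lock);
}

void sketch_counter_add(struct sketch_counter *c, int n)
{
    sketch_prot_t lev;
    SKETCH_PROTECT(&c->lock, lev);
    c->value += n;
    SKETCH_UNPROTECT(&c->lock, lev);
}

/* Nested critical section on the same resource: this is why the POSIX
   mutex has to be recursive. */
void sketch_counter_add_twice(struct sketch_counter *c, int n)
{
    sketch_prot_t lev;
    SKETCH_PROTECT(&c->lock, lev);
    sketch_counter_add(c, n);
    sketch_counter_add(c, n);
    SKETCH_UNPROTECT(&c->lock, lev);
}
```

Because the embedded macros never expand their 'lock' argument, call
sites can be written once, and the small-system build should pay nothing
for the extra parameter in either code size or memory.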
> Now I realize that you said the performance issue may be due to
> insufficient TCP tuning, but I'm still worried about contention. But
> problem 1/ is a killer, we cannot arbitrarily restrict the number of
> sockets like that.
>
The static-array problem affects only sockets in our case, because
LwIP already lets the user use malloc() for all other objects. I'm
working on malloc() support for sockets too, and hope to have it
solved next week.
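
To make the idea concrete, here is a rough sketch of a socket allocator
with two compile-time backends, along the lines of the "abstraction with
two backends" suggested in the log. All names (sketch_sock,
SKETCH_STATIC_SOCKETS, and so on) are hypothetical and do not correspond
to LwIP's actual socket code; the struct is reduced to two fields for
illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical, heavily simplified socket object. */
struct sketch_sock {
    int in_use;
    int fd;
};

#ifdef SKETCH_STATIC_SOCKETS
/* Embedded backend: fixed table, no heap, hard upper limit. */
#define SKETCH_NUM_SOCKETS 4
static struct sketch_sock sketch_table[SKETCH_NUM_SOCKETS];

struct sketch_sock *sketch_sock_alloc(void)
{
    for (size_t i = 0; i < SKETCH_NUM_SOCKETS; i++) {
        if (!sketch_table[i].in_use) {
            sketch_table[i].in_use = 1;
            return &sketch_table[i];
        }
    }
    return NULL;                /* table exhausted */
}

void sketch_sock_free(struct sketch_sock *s)
{
    s->in_use = 0;
}
#else
/* Hosted backend: one heap object per socket, limited only by memory. */
struct sketch_sock *sketch_sock_alloc(void)
{
    struct sketch_sock *s = calloc(1, sizeof *s);
    if (s != NULL)
        s->in_use = 1;
    return s;
}

void sketch_sock_free(struct sketch_sock *s)
{
    free(s);
}
#endif
```

Callers only ever see sketch_sock_alloc() and sketch_sock_free(), so the
static build keeps its current footprint. A real implementation would
also need a way to map integer descriptors back to socket objects, which
this sketch leaves out.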
>
> Cheers,
> Justus