Re: [Ccrtp-devel] Seeking help porting ccrtp to OpenBSD 3.9


From: Michael Grigoni
Subject: Re: [Ccrtp-devel] Seeking help porting ccrtp to OpenBSD 3.9
Date: Sat, 01 Jul 2006 15:47:39 -0500
User-agent: Mozilla/5.0 (Windows; U; Windows NT 5.0; en-US; rv:1.7.2) Gecko/20040804 Netscape/7.2 (ax)

Hi Federico,

Thanks much for your reply.


> - The problem with ccrtptest seems related to port 34566 being
>   unavailable for binding or any kind of conflict.


There is nothing on that port but for a test I changed it to 2000
in ccrtptest.cpp; the abort happened in exactly the same manner.
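
One way to rule out the port independently of ccrtp is a small bind probe. The following is only a sketch: the helper name udp_bind_check is hypothetical, and it assumes a plain IPv4 UDP bind on all interfaces, which may differ from how ccrtptest opens its sockets.

```c
#define _XOPEN_SOURCE 700
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: try to bind a UDP socket to `port` on all
 * interfaces. Returns 0 on success, otherwise the errno value from
 * socket() or bind() (for example EADDRINUSE or EACCES). */
int udp_bind_check(unsigned short port)
{
    struct sockaddr_in addr;
    int fd, rc;

    fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return errno;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    /* Capture errno before close() has a chance to overwrite it. */
    rc = (bind(fd, (struct sockaddr *)&addr, sizeof addr) == 0) ? 0 : errno;
    close(fd);
    return rc;
}
```

Calling udp_bind_check(34566) and udp_bind_check(2000) from a small main() would show whether either port can be bound at all; passing port 0 asks the kernel for any free port.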

>   Did you try
>   'ccrtptest --send' and 'ccrtptest --recv'?

There doesn't appear to be any argument parsing in 'ccrtptest.cpp',
so I changed the variable assignments for 'send' and 'recv'
one at a time from 'false' to 'true':

   for 'send', ccrtptest runs silently for about five seconds and
   quits. It doesn't appear to bind a UDP port.

   for 'recv', ccrtptest runs silently for about eleven seconds and
   quits. It _does_ bind UDP port 2000 (I changed it from 34566 --
   see above).

> A last question. Are you currently able to get twinkle to work even
> though some things fail?


Michel was concerned about OpenBSD's pthreads POSIX compliance; pthreads
in 3.9 conform to "ISO/IEC 9945-1 ANSI/IEEE (``POSIX'') Std 1003.1
Second Edition 1996-07-12". A suite of regression tests verifies
this functionality. It has a working recursive mutex. According
to recent discussions in the OpenBSD lists, fairness of scheduling
should not be a problem for an application which conforms to the
POSIX threads specifications.

Here are Michel's latest responses:

This does not look nice. I think most of these problems are due
to thread scheduling issues. In the log file you can see that
the far-end sends a 200 OK when you answer the call. The twinkle
listener thread receives it and prints the message in the log.
The listener thread will handover the message to the transaction
manager thread. But it seems this thread does not get processing
time for a long time. The far-end keeps retransmitting the
200 OK and the listener thread receives them, so the listener
thread seems to get processing time. Then all of a sudden the
transaction manager thread gets time and the 200 OK is
processed. Why doesn't the transaction manager thread get
processing time when it has work to do?

I am not familiar with OpenBSD, but I did a quick search on Google.
I see some articles saying that OpenBSD implements userland threads
which do not provide true concurrency.

The architecture of Twinkle heavily relies on fair thread
scheduling. During a call more than 13 threads run simultaneously.
Synchronization between the threads is mostly based on semaphores.
On Linux with a POSIX thread implementation this works fine.
I think you need an expert on OpenBSD and threading to look into
these issues. If OpenBSD does not implement POSIX-compliant threads
then I think you'll have a hard time getting Twinkle working.
Another thing to look at is recursive mutexes. Twinkle uses recursive
mutexes. Not all OSes support these, I believe.
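
A quick way to check recursive mutex support on a given platform is a probe like the one below. This is a minimal sketch, not Twinkle's code: the function name recursive_mutex_works is hypothetical, and it only uses the standard POSIX attribute calls.

```c
#define _XOPEN_SOURCE 700
#include <pthread.h>

/* Hypothetical probe: returns 1 if the calling thread can lock a
 * PTHREAD_MUTEX_RECURSIVE mutex twice without blocking, 0 otherwise. */
int recursive_mutex_works(void)
{
    pthread_mutexattr_t attr;
    pthread_mutex_t m;
    int ok = 0;

    if (pthread_mutexattr_init(&attr) != 0)
        return 0;

    if (pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE) == 0 &&
        pthread_mutex_init(&m, &attr) == 0) {
        if (pthread_mutex_lock(&m) == 0) {
            /* trylock avoids a self-deadlock if the mutex turns out
             * not to be recursive after all. */
            if (pthread_mutex_trylock(&m) == 0) {
                ok = 1;
                pthread_mutex_unlock(&m);
            }
            pthread_mutex_unlock(&m);
        }
        pthread_mutex_destroy(&m);
    }
    pthread_mutexattr_destroy(&attr);
    return ok;
}
```

The second acquisition uses pthread_mutex_trylock rather than pthread_mutex_lock so that a non-recursive implementation reports failure instead of hanging the test.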

And then there is the painful coexistence of threads and
signals. Twinkle supports LinuxThreads and NPTL to correctly
handle SIGALRM.

Having said this, some of the freezes may be due to bugs in
Twinkle. I wouldn't be surprised if there are still some
race conditions that may lead to a deadlock. I have never
experienced them, but with a different thread scheduling
algorithm, such bugs will rear their ugly heads.

> (pretty much stuck with processes). My brief reading of LinuxThreads
> descriptions leads me to believe that those threads are mapped to
> processes (not very efficient?).

Yep, that was in the 2.4 kernel, though the scheduling was good
enough for Twinkle. Later kernels have NPTL instead of LinuxThreads.
Threads are not mapped to processes anymore. I remember I had
some nasty thread scheduling problems when going from LinuxThreads
to NPTL; I had one thread holding a mutex and another thread doing
a lock on the mutex, so the second thread gets blocked. Then the
first thread released the mutex and quickly after it wanted to lock
the mutex again. In LinuxThreads, the second thread got the mutex
as soon as it was released by the first. In NPTL the first thread
got the lock again and the second thread starved!

There shouldn't be threads anymore in Twinkle that continuously
lock and unlock mutexes.

> I realize this is a daunting problem; I hope you have a few moments
> to look at the manpage and let me know if anything in it is
> going to be fatal to twinkle.

I had a quick look. I find it hard to judge if there are
real showstoppers here. The pthread calls look good to me, but
I cannot tell how exactly the thread scheduler will schedule the
threads.

Most of the threads in Twinkle do something like this in
their mainloop:

while (true) {
    sem_wait
    get event from queue
    do something with the event
}

When the queue is empty, the semaphore will be 0 and sem_wait blocks.
Another thread puts a message in the queue and calls sem_post,
so the sem_wait can now return.
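
The loop described above can be sketched in C roughly as follows. This is an illustration under stated assumptions, not Twinkle's actual code: the queue layout, QUEUE_CAP, STOP_EVENT and all function names are hypothetical, and a second semaphore is added only to keep the fixed-size queue from overflowing.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>

#define QUEUE_CAP  16      /* hypothetical fixed queue capacity */
#define STOP_EVENT (-1)    /* hypothetical marker that ends the worker */

static int queue[QUEUE_CAP];
static int head, tail;
static pthread_mutex_t queue_lock = PTHREAD_MUTEX_INITIALIZER;
static sem_t queue_items;  /* number of events waiting in the queue */
static sem_t queue_slots;  /* number of free slots in the queue */

/* sem_wait may be interrupted by a signal; retry in that case. */
static void wait_sem(sem_t *s)
{
    while (sem_wait(s) == -1 && errno == EINTR) {
    }
}

/* Producer side: put an event in the queue, then sem_post. */
static void post_event(int ev)
{
    wait_sem(&queue_slots);
    pthread_mutex_lock(&queue_lock);
    queue[tail] = ev;
    tail = (tail + 1) % QUEUE_CAP;
    pthread_mutex_unlock(&queue_lock);
    sem_post(&queue_items);
}

/* Consumer side: the main loop. Blocks in sem_wait while the queue
 * is empty and counts the events it has handled. */
static void *worker(void *arg)
{
    int *handled = arg;

    for (;;) {
        int ev;

        wait_sem(&queue_items);
        pthread_mutex_lock(&queue_lock);
        ev = queue[head];
        head = (head + 1) % QUEUE_CAP;
        pthread_mutex_unlock(&queue_lock);
        sem_post(&queue_slots);

        if (ev == STOP_EVENT)
            break;
        (*handled)++;      /* "do something with the event" */
    }
    return NULL;
}

/* Starts one worker, posts n_events events plus a stop marker, and
 * returns how many events the worker handled. */
int run_event_loop_demo(int n_events)
{
    pthread_t tid;
    int handled = 0;
    int i;

    head = tail = 0;
    sem_init(&queue_items, 0, 0);
    sem_init(&queue_slots, 0, QUEUE_CAP);
    pthread_create(&tid, NULL, worker, &handled);

    for (i = 0; i < n_events; i++)
        post_event(i);
    post_event(STOP_EVENT);

    pthread_join(tid, NULL);
    sem_destroy(&queue_items);
    sem_destroy(&queue_slots);
    return handled;
}
```

If the worker never gets scheduled after sem_post, run_event_loop_demo will stall in pthread_join (or in post_event once the queue fills), so a small test like this may help separate scheduler behaviour from application bugs.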

The question is: when will the thread scheduler allow the thread calling
sem_wait to run?

I cannot answer that question.

The listener thread does a blocking I/O read on the UDP
socket, so it can run when a UDP packet arrives.

The ccrtp library also creates some threads. I don't know
what their mainloop looks like.

Same for the Qt mainloop.

You might try and experiment with pthread_setschedparam and
try round robin scheduling. I wanted to use that too, but on Linux
a process needs root privileges to do that.

On Linux I have 3 scheduling algorithms:

1) SCHED_OTHER - regular non-realtime.
   This is what Twinkle uses, so it is not realtime but good enough.
2) SCHED_FIFO - realtime FIFO (needs root privileges)
3) SCHED_RR - realtime round robin (needs root privileges)

In your man pages I only see SCHED_FIFO and SCHED_RR.
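
An experiment with round robin scheduling could start from something like the sketch below. The function names are hypothetical, and the choice of the minimum SCHED_RR priority is an assumption made for illustration; on Linux the call is expected to fail with EPERM without sufficient privileges, and the behaviour on OpenBSD may differ.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <pthread.h>
#include <sched.h>

/* Hypothetical helper: returns the scheduling policy of the calling
 * thread (SCHED_OTHER, SCHED_FIFO, SCHED_RR, ...), or -1 on error. */
int current_policy(void)
{
    struct sched_param param;
    int policy;

    if (pthread_getschedparam(pthread_self(), &policy, &param) != 0)
        return -1;
    return policy;
}

/* Hypothetical helper: ask for SCHED_RR at the lowest priority that
 * policy allows. Returns 0 on success, or the error number reported
 * by pthread_setschedparam (EPERM when privileges are missing). */
int try_round_robin(void)
{
    struct sched_param param;

    param.sched_priority = sched_get_priority_min(SCHED_RR);
    return pthread_setschedparam(pthread_self(), SCHED_RR, &param);
}
```

Printing current_policy() before and after try_round_robin() shows whether the request took effect; note that pthread_setschedparam returns the error number directly rather than setting errno.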

Another call that looks interesting is: pthread_multi_np()

The pthread_multi_np() function causes the process to return
to multi-threaded scheduling mode.

I have no idea what exactly it does. I don't have this call.

Maybe there are some tools for debugging threads to see which
threads run at what times. I don't know as I didn't need such
a tool.

--------------------------------------------------------------

Here are the BUGS described in OpenBSD's pthreads manpage:

     The library contains a scheduler that uses the process
     virtual interval timer to pre-empt running threads.
     This means that using setitimer(2) to alter the process
     virtual timer will have undefined effects.  The
     SIGVTALRM will never be delivered to threads in a process.

     Some pthread functions fail to work correctly when linked
     using the -g option to cc(1) or gcc(1).  The problems do
     not occur when linked using the -ggdb option.

I will relink ccrtp without '-g' and report the results.

Regards,

Michael








