From libev at schmorp.de Sat Nov 3 16:15:07 2007
From: libev at schmorp.de (Marc Lehmann)
Date: Sat Nov 3 16:15:18 2007
Subject: [Libevent-users] announcing libev, towards a faster and more featureful libevent
Message-ID: <20071103201507.GA11441@schmorp.de>

Hi!

On Tuesday, I sent mail about various problems with libevent and its
current API as well as implementation. Unfortunately, the mail has not yet
shown up, but fortunately, it has been superseded by this one :)

In that mail, I announced that I will work on the problems I encountered
in libevent (many of which have been reported and discussed earlier on
this list). After analyzing libevent I decided that it wasn't fixable
except by rewriting the core parts of it (the inability to have multiple
watchers for the same file descriptor event turned out to be blocking for
my applications, otherwise I wouldn't have started the effort in the first
place...).

The results look promising so far: I additionally implemented a libevent
compatibility layer and benchmarked both libraries using the benchmark
program provided by libevent: http://libev.schmorp.de/bench.html

Here is an incomplete list of what I changed and added (see the full
list at http://cvs.schmorp.de/libev/README, or the cvs repository at
http://cvs.schmorp.de/libev/):

fixed or improved:

* multiple watchers can wait for the same event, there is no limitation
  to one or two watchers for signals and io.
* there is full support for fork, you can continue to use the event loop
  in the parent and child (or just one of them), even with quirky backends
  such as epoll.
* there are two types of timers, based on real time differences and wall
  clock time (cron-like). timers can also be repeating and be reset at
  almost no cost (for idle timeouts used by many network servers). time jumps
  get detected reliably in both directions with or without a monotonic clock.
* timers are managed by a priority queue (O(1) for important operations
  as opposed to O(log n) in libevent, also resulting in much simpler code).
* event watchers can be added and removed at any time (in libevent,
  removing events that are pending can lead to crashes).
* different types of events use different watchers, so you don't have
  to use an i/o event watcher for timeouts, and you can reset timers
  separately from other types of watchers. Also, watchers are much smaller
  (even the libevent emulation watcher only has about 2/3 of the size of a
  libevent watcher).
* I added idle watchers, pid watchers and hook watchers into the event loop,
  as is required for integration of other event-based libraries, without
  having to force the use of some construct around event_loop.
* the backends use a much simpler design. unlike in libevent, the code to
  handle events is not duplicated for each backend, backends deal only
  with file descriptor events and a single timeout value, everything else
  is handled by the core, which also optimises state changes (the epoll
  backend is 100 lines in libev, as opposed to >350 lines in libevent,
  without suffering from its limitations).

As for compatibility, the actual libev api is very different to the
libevent API (although the design is similar), but there is an emulation
layer with a corresponding event.h file that supports the event library
(but no evbuffer, evdns, evhttp etc.).

It works very well: both the testsuite as well as the benchmark program
from libevent work as expected, and the evdns code compiles (and runs)
without any changes, making libev a drop-in replacement for libevent
users. "Porting" evbuffer and so on would be equally trivial.

However: libev has not yet been released on its own (only as part of the
EV perl module), meaning there is no configure script, C-level
documentation etc.

The "obvious" plan would be to take the evhttp etc.
code from libevent and
paste it in to libev, making libev a complete replacement for libevent
with an optional new API. The catch is, I'd like to avoid this, because I
am not prepared to maintain yet another library, and I am not keen on
replicating the configure and portability work that went into libevent so
far.

So, would there be an interest in replacing the "core" event part of
libevent with the libev code? If yes, there are a number of issues to
solve, and here is how I would solve them:

* libev only supports select and epoll. Adding poll would be trivial for me,
  and adding /dev/poll and kqueue support would be easy, except that I don't
  want to set-up some bsd machines just for that. I would, however, opt to
  write kqueue and /dev/poll backends "dry" and let somebody else do the
  actual porting stuff to the then-existing backends.

* libev only supports one "event base". The reason is that I think a
  leader/follower pattern together with lazy I/O would beat any multiple
  event loop implementation, and the fact that each time I see some
  software module using its own copy of some mainloop meant that it
  doesn't work together with other such modules :) Still, if integration
  should be achieved, this is no insurmountable obstacle :->

* libev uses its own api, which potentially causes some global symbol
  spamming (lots of functions with ev_ prefix). This could be used as a
  chance to switch to a more efficient API while at the same time keeping
  100% backwards compatibility. (The event emulation layer in libev is
  only 1.3kb of codesize on my machine, so reasonably thin).

So what do you think? If you are not interested, I am prepared to maintain
it on my own, too, so there is no pressure for either direction, I think.

Greetings, and thanks for making me write my own event loop, something I
had planned to do for many years now :)

Also kudos for the event_once interface, which I wouldn't have
implemented if I hadn't seen it in libevent before.
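
A rough illustration of the first item in the feature list above (several
watchers waiting on the same file descriptor): the sketch below keeps a
linked list of watchers per descriptor and calls every matching one when
the descriptor becomes ready. It is only a toy under stated assumptions,
not libev's or libevent's code; every name in it (sk_watcher, sk_start,
sk_stop, sk_dispatch, sk_count_cb, SK_READ, SK_WRITE, SK_MAXFD) is
hypothetical.

```c
#include <stddef.h>

/* Hypothetical names throughout; this is not the libev or libevent API. */
enum { SK_READ = 1, SK_WRITE = 2 };

struct sk_watcher {
    struct sk_watcher *next;                        /* next watcher on the same fd */
    int events;                                     /* SK_READ and/or SK_WRITE */
    void (*cb)(struct sk_watcher *w, int revents);  /* called when ready */
};

#define SK_MAXFD 64
static struct sk_watcher *sk_fds[SK_MAXFD];         /* one list head per fd */

/* Register a watcher; any number of watchers may share one fd. */
static int sk_start(int fd, struct sk_watcher *w)
{
    if (fd < 0 || fd >= SK_MAXFD)
        return -1;
    w->next = sk_fds[fd];
    sk_fds[fd] = w;
    return 0;
}

/* Unlink a watcher from its fd list; does nothing if it is not there. */
static void sk_stop(int fd, struct sk_watcher *w)
{
    struct sk_watcher **pp;

    if (fd < 0 || fd >= SK_MAXFD)
        return;
    for (pp = &sk_fds[fd]; *pp; pp = &(*pp)->next)
        if (*pp == w) {
            *pp = w->next;
            w->next = NULL;
            return;
        }
}

/* What a backend would call once it learns that fd is ready.
 * Returns the number of callbacks that were invoked. */
static int sk_dispatch(int fd, int revents)
{
    struct sk_watcher *w, *next;
    int n = 0;

    if (fd < 0 || fd >= SK_MAXFD)
        return 0;
    for (w = sk_fds[fd]; w; w = next) {
        next = w->next;      /* read first, so a callback may stop its own watcher */
        if (w->events & revents) {
            w->cb(w, w->events & revents);
            ++n;
        }
    }
    return n;
}

/* Example callback: just counts how often it ran. */
static int sk_hits;
static void sk_count_cb(struct sk_watcher *w, int revents)
{
    (void)w;
    (void)revents;
    ++sk_hits;
}
```

With two watchers started on the same descriptor, one interested in reads
and one in reads and writes, a read event reaches both and a write event
reaches only the second. A real loop would also have to cope with
callbacks that stop other watchers on the same descriptor, which this
sketch does not attempt.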
-- 
                The choice of a       Deliantra, the free code+content MORPG
      -----==-     _GNU_              http://www.deliantra.net
      ----==-- _       generation
      ---==---(_)__  __ ____  __      Marc Lehmann
      --==---/ / _ \/ // /\ \/ /      pcg@goof.com
      -=====/_/_//_/\_,_/ /_/\_\

From william at 25thandClement.com Sat Nov 3 18:45:39 2007
From: william at 25thandClement.com (William Ahern)
Date: Sat Nov 3 18:45:56 2007
Subject: [Libevent-users] announcing libev, towards a faster and more featureful libevent
In-Reply-To: <20071103201507.GA11441@schmorp.de>
References: <20071103201507.GA11441@schmorp.de>
Message-ID: <20071103224539.GA20579@wilbur.25thandClement.com>

On Sat, Nov 03, 2007 at 09:15:07PM +0100, Marc Lehmann wrote:
> In that mail, I announced that I will work on the problems I encountered
> in libevent (many of which have been reported and discussed earlier on
> this list). After analyzing libevent I decided that it wasn't fixable
> except by rewriting the core parts of it (the inability to have multiple
> watchers for the same file descriptor event turned out to be blocking for
> my applications, otherwise I wouldn't have started the effort in the first
> place...).

A good itch, indeed.

> The results look promising so far: I additionally implemented a libevent
> compatibility layer and benchmarked both libraries using the benchmark
> program provided by libevent: http://libev.schmorp.de/bench.html
>
> Here is an incomplete list of what I changed and added (see the full
> list at http://cvs.schmorp.de/libev/README, or the cvs repository at
> http://cvs.schmorp.de/libev/):

Man. More pressure to rename my library from "libevnet" to something else ;)

> * there is full support for fork, you can continue to use the event loop
> in the parent and child (or just one of them), even with quirky backends
> such as epoll.

Curious how you managed to do this. Are you checking the process PID on each
loop?

> * there are two types of timers, based on real time differences and wall
> clock time (cron-like).
> timers can also be repeating and be reset at
> almost no cost (for idle timeouts used by many network servers). time jumps
> get detected reliably in both directions with or without a monotonic clock.

But then they're not truly "real-time", no?

> * event watchers can be added and removed at any time (in libevent,
> removing events that are pending can lead to crashes).

This is news to me. Can you give more detail, maybe with pointers to code?

> * different types of events use different watchers, so you don't have
> to use an i/o event watcher for timeouts, and you can reset timers
> separately from other types of watchers. Also, watchers are much smaller
> (even the libevent emulation watcher only has about 2/3 of the size of a
> libevent watcher).

libevnet does this for I/O; timer is always set separately from read/write
events. (Point being, it's using libevent.)

> * I added idle watchers, pid watchers and hook watchers into the event loop,
> as is required for integration of other event-based libraries, without
> having to force the use of some construct around event_loop.

Needing to do an operation on every loop is arguably very rare, and there's
not much burden in rolling your own. PID watchers, likewise... how many spots
in the code independently manage processes (as opposed to one unit which can
just catch SIGCHLD).

Also, curious how/if you've considered Win32 environments.

> * the backends use a much simpler design. unlike in libevent, the code to
> handle events is not duplicated for each backend, backends deal only
> with file descriptor events and a single timeout value, everything else
> is handled by the core, which also optimises state changes (the epoll
> backend is 100 lines in libev, as opposed to >350 lines in libevent,
> without suffering from its limitations).

libevnet optimizes state changes.
Logically every I/O request is single-shot
(which is more forgiving to user code), but it actually sets EV_PERSIST and
delays libevent bookkeeping until the [libevnet bufio] callback returns. If
the user code submits another I/O op from its callback (highly likely) then
the event is left unchanged. It's still re-entrant safe because it can
detect further activity up the call chain using some stack message passing
bits (instead of reference counting because I also use mem pools, but I
digress). Again, point being this can be done using libevent as-is.

> As for compatibility, the actual libev api is very different to the
> libevent API (although the design is similar), but there is an emulation
> layer with a corresponding event.h file that supports the event library
> (but no evbuffer, evdns, evhttp etc.).

Well... if you can persuade me of the utility then this Christmas I might
want to investigate writing an evdns-like component. See the "lookup"
component of libevnet. There are lots of nice things I need in a DNS
resolver that evdns and others are incapable of handling. And I've also
written more HTTP, RTSP, and SOCKS5 parsers than I can remember.

> The "obvious" plan would be to take the evhttp etc. code from libevent and
> paste it in to libev, making libev a complete replacement for libevent
> with an optional new API. The catch is, I'd like to avoid this, because I
> am not prepared to maintain yet another library, and I am not keen on
> replicating the configure and portability work that went into libevent so
> far.

If you ask me, it would prove more fortuitous to re-write the DNS and HTTP
components than to replace libevent. Reason being because it would be hard
to substantively improve on DNS/HTTP without altering the API, whereas
clearly it's feasible to improve libevent under the hood without altering
the existing API, and then building your new features on top of this.
That's sort of what I did with libevnet, by adding buffered I/O, DNS, and
thread management API atop libevnet.

>
> So, would there be an interest in replacing the "core" event part of
> libevent with the libev code? If yes, there are a number of issues to
> solve, and here is how I would solve them:

Win32 support is important to me, unfortunately. As it likely is to others.
libevent has a very, very large installed base of users.

> * libev only supports select and epoll. Adding poll would be trivial for me,
> and adding dev/poll and kqueue support would be easy, except that I don't
> want to set-up some bsd machines just for that. I would, however, opt to
> write kqueue and /dev/poll backends "dry" and let somebody else do the
> actual porting stuff to the then-existing backends.

I always thought it would be easier to just create kqueue wrappers around
epoll, poll, select, et al, and then build a library on that. Once you start
adding things like PID events, etc, it's at least worth some thought.

> * libev only supports one "event base". The reason is that I think a
> leader/follower pattern together with lazy I/O would beat any multiple
> event loop implementation, and the fact that each time I see some
> software module using its own copy of some mainloop meant that it
> doesn't work together with other such modules :) Still, if integration
> should be achieved, this is no insurmountable obstacle :->

So you (1) want to support multiple destination events for the same source
but (2) let multiple threads signal discrete events? That sounds like an
invitation for even more trouble. Now you've got mutexes littered all over
the place. That's my version of a nightmare. And if you group discrete
events into the same thread, then you haven't solved the CPU workload
problem.

I too prefer a single event base.
The obvious problem being when you
actually have CPU intensive work, say an MPEG streaming service--lots of
I/O--which also needs to do intermediate AV processing--lots of CPU. But now
systems such as Linux are adding semaphore/mutex events, so it's economical
and easy to independently coordinate worker threads with the main event
loop. This lets developers keep tight rein on the flow of processing and
minimize concurrency headaches. You get the best of both worlds at their
peak efficiency.

By putting thread management into the loop you've created both advantages
and disadvantages; in this instance symmetry isn't so elegant. In other
words, arguably as much baggage has been added as problem solving utility.

Though, I think we've already had this debate... so... I'll just shut-up ;)

From libev at schmorp.de Sat Nov 3 18:52:10 2007
From: libev at schmorp.de (Marc Lehmann)
Date: Sat Nov 3 18:52:13 2007
Subject: [Libevent-users] test/*.c files use installed header files not the ones from the package
Message-ID: <20071103225210.GC11441@schmorp.de>

during integration of libev into libevent, I found that the test/* files
include the *installed* header files instead of the ones from the package.

this doesn't work if there are any structure changes (obviously the structure
change in this case was going to libev).

fixing the #include statements to use "" fixed this.
-- 
                The choice of a       Deliantra, the free code+content MORPG
      -----==-     _GNU_              http://www.deliantra.net
      ----==-- _       generation
      ---==---(_)__  __ ____  __      Marc Lehmann
      --==---/ / _ \/ // /\ \/ /      pcg@goof.com
      -=====/_/_//_/\_,_/ /_/\_\

From provos at citi.umich.edu Sat Nov 3 19:18:19 2007
From: provos at citi.umich.edu (Niels Provos)
Date: Sat Nov 3 19:18:27 2007
Subject: [Libevent-users] test/*.c files use installed header files not the ones from the package
In-Reply-To: <20071103225210.GC11441@schmorp.de>
References: <20071103225210.GC11441@schmorp.de>
Message-ID: <850f7cbe0711031618o42a96b0dt232912d533456953@mail.gmail.com>

Hi Marc,

I appreciate your insights, but your message has nothing to do with
libevent. The make files in libevent use the -I option to provide
the path to the header files.

I also noticed that you seem to have found several bugs in libevent.
It would be nice if you could send patches for them. I am a little
bit dubious about some of the claims such as not being able to remove
an event while in a callback.

Thanks,
Niels.

On 11/3/07, Marc Lehmann wrote:
> during integration of libev into libevent, I found that the test/* files
> include the *installed* header files instead of the ones from the package.
>
> this doesn't work if there are any structure changes (obviously the structure
> change in this case was going to libev).
>
> fixing the #include statements to use "" fixed this.
>
> --
>                 The choice of a       Deliantra, the free code+content MORPG
>       -----==-     _GNU_              http://www.deliantra.net
>       ----==-- _       generation
>       ---==---(_)__  __ ____  __      Marc Lehmann
>       --==---/ / _ \/ // /\ \/ /      pcg@goof.com
>       -=====/_/_//_/\_,_/ /_/\_\
> _______________________________________________
> Libevent-users mailing list
> Libevent-users@monkey.org
> http://monkey.org/mailman/listinfo/libevent-users
>
>

From pfisher at alertlogic.net Sat Nov 3 19:56:30 2007
From: pfisher at alertlogic.net (Paul Fisher)
Date: Sat Nov 3 19:55:10 2007
Subject: [Libevent-users] [PATCH] Allow evhttp server to use alternative event_base
Message-ID: <307EFBA19B7B8D41A3E823644410291F302712@exchange01.alertlogic.net>

Skipped content of type multipart/alternative
-------------- next part --------------
A non-text attachment was scrubbed...
Name: 003_evhttp_event_base.patch
Type: text/x-patch
Size: 3930 bytes
Desc: 003_evhttp_event_base.patch
Url : http://monkeymail.org/archives/libevent-users/attachments/20071103/0d7d67ad/003_evhttp_event_base.bin

From provos at citi.umich.edu Sat Nov 3 19:56:07 2007
From: provos at citi.umich.edu (Niels Provos)
Date: Sat Nov 3 19:56:13 2007
Subject: [Libevent-users] [PATCH] TAILQ_ENTRY missing in evhttp.h on linux
In-Reply-To: <307EFBA19B7B8D41A3E823644410291F30270D@exchange01.alertlogic.net>
References: <307EFBA19B7B8D41A3E823644410291F30270D@exchange01.alertlogic.net>
Message-ID: <850f7cbe0711031656w4d22397dr9d0130d48e7bdd87@mail.gmail.com>

Try

#include <sys/queue.h>

before including evhttp.h

Niels.

On 10/30/07, Paul Fisher wrote:
>
>
>
> In using the latest 1.3e on linux, evhttp.h fails to compile because of a
> missing definition of TAILQ_ENTRY in evhttp.h. This is due to the fact that
> the workaround in event.h is #define'd and #undef'd within event.h and not
> available to evhttp.h when defining "struct evhttp_request".
> This patch
> obviously fixes it:
>
> --- libevent-1.3e/evhttp.h 2007-08-25 13:49:22.000000000 -0500
> +++ libevent-1.3e.002/evhttp.h 2007-10-29 22:32:07.000000000 -0500
> @@ -108,7 +108,14 @@
>   * reasonable accessors.
>   */
>  struct evhttp_request {
> +#if defined(TAILQ_ENTRY)
>  TAILQ_ENTRY(evhttp_request) next;
> +#else
> +struct { \
> + struct type *tqe_next; /* next element */ \
> + struct type **tqe_prev; /* address of previous next element */ \
> +} next;
> +#endif
>
>  /* the connection object that this request belongs to */
>  struct evhttp_connection *evcon;
>
> ... but it would be nice if this was coordinated with the definition in
> event.h, possibly by simply not #undef'ing it from event.h. Anyway, if
> there is a preference on how to fix this, I'd be glad to regenerate the
> patch.
>
>
> --
>
> paul
> _______________________________________________
> Libevent-users mailing list
> Libevent-users@monkey.org
> http://monkey.org/mailman/listinfo/libevent-users
>
>

From provos at citi.umich.edu Sat Nov 3 19:57:44 2007
From: provos at citi.umich.edu (Niels Provos)
Date: Sat Nov 3 19:57:58 2007
Subject: [Libevent-users] [PATCH] Allow evhttp server to use alternative event_base
In-Reply-To: <307EFBA19B7B8D41A3E823644410291F302712@exchange01.alertlogic.net>
References: <307EFBA19B7B8D41A3E823644410291F302712@exchange01.alertlogic.net>
Message-ID: <850f7cbe0711031657u58c69801hc889b06107138f0f@mail.gmail.com>

Hi Paul,

thank you very much for your patch. Unfortunately, something very
similar is already present in trunk:

/** Create a new HTTP server
 *
 * @param base (optional) the event base to receive the HTTP events
 * @return a pointer to a newly initialized evhttp server structure
 */
struct evhttp *evhttp_new(struct event_base *base);

/**
 * Start an HTTP server on the specified address and port.
 *
 * Can be called multiple times to bind the same http server
 * to multiple different ports.
 *
 * @param address a string containing the IP address to listen(2) on
 * @param port the port number to listen on
 * @return a newly allocated evhttp struct
 * @see evhttp_free()
 */
int evhttp_bind_socket(struct evhttp *http, const char *address, u_short port);

Niels.

On 11/3/07, Paul Fisher wrote:
>
>
>
> The attached patch allows an evhttp server to properly operate on an
> alternative event_base that is set via a new interface evhttp_base_start(),
> which is also added by this patch. Basically this makes the http.c
> implementation apply the event_base present in the struct evhttp instance
> associated with the evhttp_connection/evhttp_request
> whenever an event is scheduled via event_set/event_add.
>
> A separate patch will need to handle the evhttp clients.
>
>
> --
>
> paul
>
> _______________________________________________
> Libevent-users mailing list
> Libevent-users@monkey.org
> http://monkey.org/mailman/listinfo/libevent-users
>
>
>

From clayne at anodized.com Sat Nov 3 20:18:51 2007
From: clayne at anodized.com (Christopher Layne)
Date: Sat Nov 3 20:18:57 2007
Subject: [Libevent-users] [PATCH] TAILQ_ENTRY missing in evhttp.h on linux
In-Reply-To: <850f7cbe0711031656w4d22397dr9d0130d48e7bdd87@mail.gmail.com>
References: <307EFBA19B7B8D41A3E823644410291F30270D@exchange01.alertlogic.net> <850f7cbe0711031656w4d22397dr9d0130d48e7bdd87@mail.gmail.com>
Message-ID: <20071104001851.GD2724@ns1.anodized.com>

On Sat, Nov 03, 2007 at 04:56:07PM -0700, Niels Provos wrote:
> Try
>
> #include <sys/queue.h>
>
> before including evhttp.h
>
> Niels.

Why is this a usercode issue? Shouldn't evhttp.h be more interested in
handling its dependencies than non-event parent code? It's similar to the
<sys/time.h> being a usercode, but event.h dependency as well. I mean why
not make the parent code just handle all includes for event.h and evhttp.h
- down to stdint.h while we're at it? Because that would be ridiculous -
and the child header should handle its own dependencies.
I know Rob Pike
may have thought at one time one shouldn't include other include files -
but that was also 1989.

-cl

From pfisher at alertlogic.net Sat Nov 3 20:21:50 2007
From: pfisher at alertlogic.net (Paul Fisher)
Date: Sat Nov 3 20:22:33 2007
Subject: [Libevent-users] [PATCH] TAILQ_ENTRY missing in evhttp.h on linux
References: <307EFBA19B7B8D41A3E823644410291F30270D@exchange01.alertlogic.net> <850f7cbe0711031656w4d22397dr9d0130d48e7bdd87@mail.gmail.com>
Message-ID: <307EFBA19B7B8D41A3E823644410291F302714@exchange01.alertlogic.net>

Then why does event.h go to the trouble of #ifndef TAILQ_ENTRY to handle
the same issue? And why not have event.h include the file directly?

--

paul

-----Original Message-----
From: provos@gmail.com on behalf of Niels Provos
Sent: Sat 11/3/2007 6:56 PM
To: Paul Fisher
Cc: libevent-users@monkey.org
Subject: Re: [Libevent-users] [PATCH] TAILQ_ENTRY missing in evhttp.h on linux

Try

#include <sys/queue.h>

before including evhttp.h

Niels.

On 10/30/07, Paul Fisher wrote:
>
>
>
> In using the latest 1.3e on linux, evhttp.h fails to compile because of a
> missing definition of TAILQ_ENTRY in evhttp.h. This is due to the fact that
> the workaround in event.h is #define'd and #undef'd within event.h and not
> available to evhttp.h when defining "struct evhttp_request". This patch
> obviously fixes it:
>
> --- libevent-1.3e/evhttp.h 2007-08-25 13:49:22.000000000 -0500
> +++ libevent-1.3e.002/evhttp.h 2007-10-29 22:32:07.000000000 -0500
> @@ -108,7 +108,14 @@
>   * reasonable accessors.
>   */
>  struct evhttp_request {
> +#if defined(TAILQ_ENTRY)
>  TAILQ_ENTRY(evhttp_request) next;
> +#else
> +struct { \
> + struct type *tqe_next; /* next element */ \
> + struct type **tqe_prev; /* address of previous next element */ \
> +} next;
> +#endif
>
>  /* the connection object that this request belongs to */
>  struct evhttp_connection *evcon;
>
> ...
> but it would be nice if this was coordinated with the definition in
> event.h, possibly by simply not #undef'ing it from event.h. Anyway, if
> there is a preference on how to fix this, I'd be glad to regenerate the
> patch.
>
>
> --
>
> paul
> _______________________________________________
> Libevent-users mailing list
> Libevent-users@monkey.org
> http://monkey.org/mailman/listinfo/libevent-users
>
>

-------------- next part --------------
An HTML attachment was scrubbed...
URL: http://monkeymail.org/archives/libevent-users/attachments/20071103/ac8f144c/attachment-0001.htm

From libev at schmorp.de Sat Nov 3 20:28:00 2007
From: libev at schmorp.de (Marc Lehmann)
Date: Sat Nov 3 20:28:03 2007
Subject: [Libevent-users] [PATCH] TAILQ_ENTRY missing in evhttp.h on linux
In-Reply-To: <20071104001851.GD2724@ns1.anodized.com>
References: <307EFBA19B7B8D41A3E823644410291F30270D@exchange01.alertlogic.net> <850f7cbe0711031656w4d22397dr9d0130d48e7bdd87@mail.gmail.com> <20071104001851.GD2724@ns1.anodized.com>
Message-ID: <20071104002800.GA25462@schmorp.de>

On Sat, Nov 03, 2007 at 05:18:51PM -0700, Christopher Layne wrote:
> On Sat, Nov 03, 2007 at 04:56:07PM -0700, Niels Provos wrote:
> > Try
> >
> > #include <sys/queue.h>
> >
> > before including evhttp.h
> >
> > Niels.
>
> Why is this a usercode issue? Shouldn't evhttp.h be more interested in

This is especially troubling as this is a nonportable header file only
available on BSDs in general.

> handling it's dependencies than non-event parent code? It's similar to the
> <sys/time.h> being a usercode, but event.h dependency as well.

The difference is that sys/time.h is at least part of some standard
(POSIX).

> not make the parent code just handle all includes for event.h and evhttp.h
> - down to stdint.h while we're at it?
> Because that would be ridiculous -

Yes :)

-- 
                The choice of a       Deliantra, the free code+content MORPG
      -----==-     _GNU_              http://www.deliantra.net
      ----==-- _       generation
      ---==---(_)__  __ ____  __      Marc Lehmann
      --==---/ / _ \/ // /\ \/ /      pcg@goof.com
      -=====/_/_//_/\_,_/ /_/\_\

From libev at schmorp.de Sat Nov 3 20:32:01 2007
From: libev at schmorp.de (Marc Lehmann)
Date: Sat Nov 3 20:32:04 2007
Subject: [Libevent-users] bug in evhttp_write_buffer/weird event_set semantics?
Message-ID: <20071104003201.GB25462@schmorp.de>

While debugging a problem of http.c with libev, I found this code
in evhttp_write_buffer:

   if (event_pending(&evcon->ev, EV_WRITE|EV_TIMEOUT, NULL))
      event_del(&evcon->ev);

   event_set(&evcon->ev, evcon->fd, EV_WRITE, evhttp_write, evcon);
   evhttp_add_event(&evcon->ev, evcon->timeout, HTTP_WRITE_TIMEOUT);

now, event_set initialises a struct event. the problem with the above code
is that it calls event_set on a struct event that is currently in use and
expects it to work.

this doesn't directly contradict the sparse documentation, but is rather weird.

looking at event_set closely, I see that it doesn't initialise the struct event
fully (but it does initialise ev_flags).

I also saw that some parts of the testsuite are indeed expecting that
libevent treats struct events that have never been initialised by an
event_set "properly" (= ignoring them).

I therefore conclude that there is no function in libevent that actually
initialises struct events.

This strikes me as a rather weird design, especially as it requires the user
of libevent to clear all his struct events before passing it to event_set.

It also is not threadsafe, as event_set overwrites ev_base with current_base
which might not be the correct one.

Note also that some test programs do not properly clear the event structure
they allocate (test-time for example).

This makes me wonder about these questions:

1. could anybody confirm whether a user must clear the struct event before
   passing it to event_set?
2. shouldn't this rather fundamental requirement be documented *somewhere*?
3. is there some guarantee that one can call event_add/del on cleared memory
   arrays and this will not crash?
4. will the code fragment above not cause an event to be added twice to some
   lists, as event_set clears the flags that event_add uses to detect whether
   an event is already on the list?

In any case, I can manage in the libev emulation layer by also relying
on some "has been initialised before" flag, but this is of course no
guarantee that code like the above will ever work (and I suspect it will
not).

In general, this whole design strikes me as rather messy.

(in libev, you always have to call one of the watcher initialiser macros
that do not depend on earlier contents).

Thanks a lot for any insights.

-- 
                The choice of a       Deliantra, the free code+content MORPG
      -----==-     _GNU_              http://www.deliantra.net
      ----==-- _       generation
      ---==---(_)__  __ ____  __      Marc Lehmann
      --==---/ / _ \/ // /\ \/ /      pcg@goof.com
      -=====/_/_//_/\_,_/ /_/\_\

From provos at citi.umich.edu Sat Nov 3 20:40:50 2007
From: provos at citi.umich.edu (Niels Provos)
Date: Sat Nov 3 20:41:08 2007
Subject: [Libevent-users] bug in evhttp_write_buffer/weird event_set semantics?
In-Reply-To: <20071104003201.GB25462@schmorp.de>
References: <20071104003201.GB25462@schmorp.de>
Message-ID: <850f7cbe0711031740q6393ec4dy442cee9ab0ebcc8f@mail.gmail.com>

Hi Marc,

I think the documentation is very clear on this:

The function event_set() prepares the event structure ev to be used in
future calls to event_add() and event_del(). The event will be
prepared to call the function specified by the fn argument with an int
argument indicating the file descriptor, a short argument indicating
the type of event, and a void * argument given in the arg argument.
The fd indicates the file descriptor that should be monitored for
events.
The events can be either EV_READ, EV_WRITE, or both,
indicating that an application can read or write from the file
descriptor respectively without blocking.

Perhaps you might like to create a libev mailing list and discuss
further development of libev there?

I find your somewhat discourteous insinuation of bugs distracting.

Thank you,
Niels.

On 11/3/07, Marc Lehmann wrote:
> While debugging a problem of http.c with libev, I found this code
> in evhttp_write_buffer:
>
>    if (event_pending(&evcon->ev, EV_WRITE|EV_TIMEOUT, NULL))
>       event_del(&evcon->ev);
>
>    event_set(&evcon->ev, evcon->fd, EV_WRITE, evhttp_write, evcon);
>    evhttp_add_event(&evcon->ev, evcon->timeout, HTTP_WRITE_TIMEOUT);
>
> now, event_set initialises a struct event. the problem with the above code
> is that it calls event_set on a struct event that is currently in use and
> expects it to work.
>
> this doesn't directly contradict the sparse documentation, but is rather weird.
>
> looking at event_set closely, I see that it doesn't initialise the struct event
> fully (but it does initialise ev_flags).
>
> I also saw that some parts of the testsuite are indeed expecting that
> libevent treats struct events that have never been initialised by an
> event_set "properly" (= ignoring them).
>
> I therefore conclude that there is no function in libevent that actually
> initialises struct events.
>
> This strikes me as a rather weird design, especially as it requires the user
> of libevent to clear all his struct events before passing it to event_set.
>
> It also is not threadsafe, as event_set overwrites ev_base with current_base
> which might not be the correct one.
>
> Note also that some test programs do not properly clear the event structure
> they allocate (test-time for example).
>
> This makes me wonder about these questions:
>
> 1. could anybody confirm whether a user must clear the struct event before
>    passing it to event_set?
> 2. shouldn't this rather fundamental requirement be documented *somewhere*?
> 3. is there some guarantee that one can call event_add/del on cleared memory
>    arrays and this will not crash?
> 4. will the code fragment above not cause an event to be added twice to some
>    lists, as event_set clears the flags that event_add uses to detect whether
>    an event is already on the list?
>
> In any case, I can manage in the libev emulation layer by also relying
> on some "has been initialised before" flag, but this is of course no
> guarantee that code like the above will ever work (and I suspect it will
> not).
>
> In general, this whole design strikes me as rather messy.
>
> (in libev, you always have to call one of the watcher initialiser macros
> that do not depend on earlier contents).
>
> Thanks a lot for any insights.
>
> --
>                 The choice of a       Deliantra, the free code+content MORPG
>       -----==-     _GNU_              http://www.deliantra.net
>       ----==-- _       generation
>       ---==---(_)__  __ ____  __      Marc Lehmann
>       --==---/ / _ \/ // /\ \/ /      pcg@goof.com
>       -=====/_/_//_/\_,_/ /_/\_\
> _______________________________________________
> Libevent-users mailing list
> Libevent-users@monkey.org
> http://monkey.org/mailman/listinfo/libevent-users
>
>

From provos at citi.umich.edu Sat Nov 3 20:53:42 2007
From: provos at citi.umich.edu (Niels Provos)
Date: Sat Nov 3 20:53:45 2007
Subject: [Libevent-users] [PATCH] TAILQ_ENTRY missing in evhttp.h on linux
In-Reply-To: <20071104001851.GD2724@ns1.anodized.com>
References: <307EFBA19B7B8D41A3E823644410291F30270D@exchange01.alertlogic.net> <850f7cbe0711031656w4d22397dr9d0130d48e7bdd87@mail.gmail.com> <20071104001851.GD2724@ns1.anodized.com>
Message-ID: <850f7cbe0711031753p7c081912l8d74ac93f8a0d47@mail.gmail.com>

On 11/3/07, Christopher Layne wrote:
> Why is this a usercode issue? Shouldn't evhttp.h be more interested in
> handling it's dependencies than non-event parent code? It's similar to the
> <sys/time.h> being a usercode, but event.h dependency as well.
I mean why > not make the parent code just handle all includes for event.h and evhttp.h > - down to stdint.h while we're at it? Because that would be ridiculous - > and the child header should handle it's own dependencies. I know Rob Pike > may have thought at one time one shouldn't include other include files - > but that was also 1989. Well, Rob Pike has many great insights. He also does not like conditional compilation. However, you are right, evhttp.h could resolve the dependency on queue.h internally. I will look into it. Niels. From libev at schmorp.de Sat Nov 3 20:57:31 2007 From: libev at schmorp.de (Marc Lehmann) Date: Sat Nov 3 20:57:35 2007 Subject: [Libevent-users] bug in evhttp_write_buffer/weird event_set semantics? In-Reply-To: <850f7cbe0711031740q6393ec4dy442cee9ab0ebcc8f@mail.gmail.com> References: <20071104003201.GB25462@schmorp.de> <850f7cbe0711031740q6393ec4dy442cee9ab0ebcc8f@mail.gmail.com> Message-ID: <20071104005731.GB25766@schmorp.de> On Sat, Nov 03, 2007 at 05:40:50PM -0700, Niels Provos wrote: > I think the documentation is very clear on this: I read that part, but the documentation doesn't mention anything about wether one can pass in uninitalised memory (as I explained in my mail). Did you read it? > The function event_set() prepares the event structure ev to be used in > future calls to event_add() and event_del(). The event will be > prepared to call the function specified by the fn argument with an int > argument indicating the file descriptor, a short argument indicating > the type of event, and a void * argument given in the arg argument. > The fd indicates the file descriptor that should be monitored for > events. The events can be either EV_READ, EV_WRITE, or both, > indicating that an application can read or write from the file > descriptor respectively without blocking. Ok, where does it say one can event_add, event_set, and then event_add again? This is what http.c actually does. 
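The add, set, add sequence questioned above can be made concrete with a toy model. This is a sketch only, not libevent code: `toy_event`, `toy_base`, `toy_set`, `toy_add` and `toy_del` are made-up names, and the model simply assumes (as the earlier mail describes for event_set) that the initialiser clears the flag which add uses to detect an event that is already registered.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: none of these names exist in libevent. */
#define TOY_ON_LIST 0x01
#define TOY_MAX     8

struct toy_event {
    int fd;
    int flags;                    /* TOY_ON_LIST while registered */
};

struct toy_base {
    struct toy_event *list[TOY_MAX];
    size_t count;
};

/* Initialiser that unconditionally resets the flags (the assumption
 * this sketch is built on). */
static void toy_set(struct toy_event *ev, int fd)
{
    ev->fd = fd;
    ev->flags = 0;
}

/* Add is idempotent only as long as the flag survives. */
static void toy_add(struct toy_base *base, struct toy_event *ev)
{
    if (ev->flags & TOY_ON_LIST)
        return;
    assert(base->count < TOY_MAX);
    base->list[base->count++] = ev;
    ev->flags |= TOY_ON_LIST;
}

/* Remove the event if it is registered; a no-op otherwise. */
static void toy_del(struct toy_base *base, struct toy_event *ev)
{
    if (!(ev->flags & TOY_ON_LIST))
        return;
    for (size_t i = 0; i < base->count; i++) {
        if (base->list[i] == ev) {
            base->list[i] = base->list[--base->count];
            break;
        }
    }
    ev->flags &= ~TOY_ON_LIST;
}

/* add, set, add: returns how many list slots now hold the same event. */
size_t toy_reuse_without_del(void)
{
    struct toy_base base = { {0}, 0 };
    struct toy_event ev;

    toy_set(&ev, 3);
    toy_add(&base, &ev);
    toy_set(&ev, 3);              /* flag lost while still on the list */
    toy_add(&base, &ev);          /* inserted a second time */
    return base.count;
}

/* add, del, set, add: the order that is safe in this model. */
size_t toy_reuse_with_del(void)
{
    struct toy_base base = { {0}, 0 };
    struct toy_event ev;

    toy_set(&ev, 3);
    toy_add(&base, &ev);
    toy_del(&base, &ev);
    toy_set(&ev, 3);
    toy_add(&base, &ev);
    return base.count;
}
```

In this model the first sequence leaves two list entries pointing at one struct, while deleting before re-initialising leaves one. Whether libevent itself behaves this way is exactly the open question in the thread.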
And where does it say that one can call event_del on a struct event without
calling event_set before? This is what other code actually does.

> Perhaps you might like to create a libev mailing list and discuss
> further development of libev there?

It would be off-topic there, as the libev API doesn't suffer from these
problems, and its testsuite isn't as buggy.

> I find your somewhat discourteous insinuation of bugs distracting.

1. I reported a number of bugs in libevent so far. I am not insinuating,
   but reporting bugs, to improve libevent.
2. I rewrote the libevent core part to be faster, much more scalable,
   with fewer artificial limitations and worked hard to contribute this
   back to libevent.

I don't think reporting bugs (or talking about possible bugs) is
discourteous, nor do I think I was in any way discourteous.

If you don't want to hear about bugs or improvements for libevent, you can
say so, but I think you are treating me rather unfairly, given the amount of
work I did. Even if libev never gets integrated into libevent, fixing the
bugs I found while trying to get it to run should be of interest to you.

Or is it the fact that something came up that is faster and more featureful
(and smaller, too) than libevent that distracts you? Fear not, as I still
intend to contribute, but I really don't want to be insulted for improving
your library. :(

> Thank you,
> Niels.

omg, a top-poster and full-quoter, too.

-- 
                The choice of a       Deliantra, the free code+content MORPG
      -----==-     _GNU_              http://www.deliantra.net
      ----==-- _       generation
      ---==---(_)__  __ ____  __      Marc Lehmann
      --==---/ / _ \/ // /\ \/ /      pcg@goof.com
      -=====/_/_//_/\_,_/ /_/\_\

From libev at schmorp.de Sat Nov 3 21:00:15 2007
From: libev at schmorp.de (Marc Lehmann)
Date: Sat Nov 3 21:00:38 2007
Subject: [Libevent-users] bug in evhttp_write_buffer/weird event_set semantics?
In-Reply-To: <20071104005731.GB25766@schmorp.de> References: <20071104003201.GB25462@schmorp.de> <850f7cbe0711031740q6393ec4dy442cee9ab0ebcc8f@mail.gmail.com> <20071104005731.GB25766@schmorp.de> Message-ID: <20071104010015.GC25766@schmorp.de> On Sun, Nov 04, 2007 at 01:57:31AM +0100, Marc Lehmann wrote: > > I find your somewhat discourteous insinuation of bugs distracting. > > 1. I reported a number of bugs in libevent so far. I am not insinuating, > but reporting bugs, to improve libevent. I wanted to mention that I also delivered the fixes to those bugs while reporting them. -- The choice of a Deliantra, the free code+content MORPG -----==- _GNU_ http://www.deliantra.net ----==-- _ generation ---==---(_)__ __ ____ __ Marc Lehmann --==---/ / _ \/ // /\ \/ / pcg@goof.com -=====/_/_//_/\_,_/ /_/\_\ From clayne at anodized.com Sat Nov 3 21:07:16 2007 From: clayne at anodized.com (Christopher Layne) Date: Sat Nov 3 21:07:22 2007 Subject: [Libevent-users] [PATCH] TAILQ_ENTRY missing in evhttp.h on linux In-Reply-To: <850f7cbe0711031753p7c081912l8d74ac93f8a0d47@mail.gmail.com> References: <307EFBA19B7B8D41A3E823644410291F30270D@exchange01.alertlogic.net> <850f7cbe0711031656w4d22397dr9d0130d48e7bdd87@mail.gmail.com> <20071104001851.GD2724@ns1.anodized.com> <850f7cbe0711031753p7c081912l8d74ac93f8a0d47@mail.gmail.com> Message-ID: <20071104010716.GF2724@ns1.anodized.com> On Sat, Nov 03, 2007 at 05:53:42PM -0700, Niels Provos wrote: > Well, Rob Pike has many great insights. He also does not like > conditional compilation. However, you are right, evhttp.h could > resolve the dependency on queue.h internally. I will look into it. Agreed, Rob Pike is true blue - and I also respect him greatly - but what he wrote back then (including the grumble about not using include guards) just doesn't hold today. If one thinks about it in an abstract sense, it's inverting the child-module dependency maintenance onto the parent module. 
It was also somewhat hypocritical in that it was a form of compiler lexical
analyzer pre-optimization in its own right.

-cl

From libev at schmorp.de Sat Nov 3 21:09:24 2007
From: libev at schmorp.de (Marc Lehmann)
Date: Sat Nov 3 21:09:31 2007
Subject: [Libevent-users] bug in evhttp_write_buffer/weird event_set semantics?
In-Reply-To: <20071104010015.GC25766@schmorp.de>
References: <20071104003201.GB25462@schmorp.de> <850f7cbe0711031740q6393ec4dy442cee9ab0ebcc8f@mail.gmail.com> <20071104005731.GB25766@schmorp.de> <20071104010015.GC25766@schmorp.de>
Message-ID: <20071104010924.GA25876@schmorp.de>

On Sun, Nov 04, 2007 at 02:00:15AM +0100, Marc Lehmann wrote:
> On Sun, Nov 04, 2007 at 01:57:31AM +0100, Marc Lehmann wrote:
> > > I find your somewhat discourteous insinuation of bugs distracting.
> >
> > 1. I reported a number of bugs in libevent so far. I am not insinuating,
> > but reporting bugs, to improve libevent.
>
> I wanted to mention that I also delivered the fixes to those bugs while
> reporting them.

And so that these aren't just idle words, here is a list:

- Debian bug #448165 against libevent (the TAILQ_ENTRY problem just reported)
- Debian bug #448173 crash bug in evdns_resolve_reverse_ipv6 (with obvious fix)
- the header file problems, causing crashes or worse in the testsuite in ABI
  changes.
- the http.c problem, reported and (by now) verified (no fix, but the function
  should probably just call event_del).

That's not insinuating, these are clear bugs and clear reports.
I think you need a serious attitude adjustment if you treat actively contributing people like this :( Doh :( -- The choice of a Deliantra, the free code+content MORPG -----==- _GNU_ http://www.deliantra.net ----==-- _ generation ---==---(_)__ __ ____ __ Marc Lehmann --==---/ / _ \/ // /\ \/ / pcg@goof.com -=====/_/_//_/\_,_/ /_/\_\ From nickm at freehaven.net Sat Nov 3 21:44:17 2007 From: nickm at freehaven.net (Nick Mathewson) Date: Sat Nov 3 21:45:37 2007 Subject: [Libevent-users] test/*.c files use installed header files not the ones from the package In-Reply-To: <850f7cbe0711031618o42a96b0dt232912d533456953@mail.gmail.com> References: <20071103225210.GC11441@schmorp.de> <850f7cbe0711031618o42a96b0dt232912d533456953@mail.gmail.com> Message-ID: <20071104014417.GA2347@totoro.wangafu.net> On Sat, Nov 03, 2007 at 04:18:19PM -0700, Niels Provos wrote: > Hi Marc, > > I appreciate your insights, but your message has nothing to do with > libevent. The make files in libevent use the -I option to provide the > path to the header files. I just tried to test this out, as follows. I built libevent, and then added "#error Foo" to the top of the event.h header in the libevent build directory. I confirmed that there existed valid headers (without the #error directive) in /usr/include and /usr/local/include. Then, I entered the "test" directory and typed "make clean; make" The compiler gave me: gcc -DHAVE_CONFIG_H -I. -I.. -I.. -I../compat -g -O2 -Wall -c test-init.c In file included from test-init.c:23: ../event.h:2:2: error: #error foo The test/test-init.c file does indeed "#include ", but it looks like thanks to the -I.. flag passed to gcc, it is finding the copy of event.h in the parent directory. This seems to confirm to me pretty well that, on my OS at least, things work okay. Marc, have you seen this problem you report in the wild? If so, can you give me some help reproducing it? As far as I can tell, Niels is right above. 
yrs,
-- 
Nick Mathewson

From libev at schmorp.de Sat Nov 3 21:57:02 2007
From: libev at schmorp.de (Marc Lehmann)
Date: Sat Nov 3 21:57:06 2007
Subject: [Libevent-users] announcing libev, towards a faster and more featureful libevent
In-Reply-To: <20071103224539.GA20579@wilbur.25thandClement.com>
References: <20071103201507.GA11441@schmorp.de> <20071103224539.GA20579@wilbur.25thandClement.com>
Message-ID: <20071104015702.GB25876@schmorp.de>

On Sat, Nov 03, 2007 at 03:45:39PM -0700, William Ahern wrote:
> A good itch, indeed.

I am currently working on integrating all modules from libevent, so it
becomes a full libevent replacement (and it already runs all of the
testsuite that doesn't require access to internals).

> > * there is full support for fork, you can continue to use the event loop
> >   in the parent and child (or just one of them), even with quirky backends
> >   such as epoll.
>
> Curious how you managed to do this. Are you checking the process PID on each
> loop?

I considered that, but I think it's too slow (one also needs to be careful
that watchers don't change e.g. epoll state until the getpid check is
done), or at least I think I don't want that speed hit, no matter what.

Instead, I make it the user's job to actually call me after a fork. I provide
three functions:

   void ev_fork_prepare (void);
   void ev_fork_parent (void);
   void ev_fork_child (void);

which you can simply plug into pthread_atfork and it will work. The reason I
don't do so myself is that I don't want to require pthreads just for that,
but the perl interface, for example, does, so perl programs will be safe.

I wrote "full support for fork" even though it's not automatic because it can
be done and is fully supported.
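Plugging the three hooks into pthread_atfork, as described above, might look roughly like the following. This is a sketch under assumptions: the three `ev_fork_*` functions here are stand-in stubs that only count calls (the real ones would come from the event library), and `install_fork_hooks` and `fork_hooks_fired` are hypothetical helper names introduced for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Stand-in stubs for the hooks named in the mail; they only count calls. */
static int prepare_calls, parent_calls, child_calls;

static void ev_fork_prepare(void) { prepare_calls++; }
static void ev_fork_parent(void)  { parent_calls++; }
static void ev_fork_child(void)   { child_calls++; }

/* Register the hooks once. Returns 0 on success, an error number otherwise. */
int install_fork_hooks(void)
{
    return pthread_atfork(ev_fork_prepare, ev_fork_parent, ev_fork_child);
}

/* Fork once and check that each hook ran exactly once, in the right process.
 * The child reports through its exit status. Returns 1 on success. */
int fork_hooks_fired(void)
{
    int before_prepare = prepare_calls;
    int before_parent  = parent_calls;
    int before_child   = child_calls;
    int status = 0;

    pid_t pid = fork();
    if (pid < 0)
        return 0;
    if (pid == 0)                       /* child: its hook ran before we get here */
        _exit(child_calls == before_child + 1 ? 0 : 1);

    if (waitpid(pid, &status, 0) != pid)
        return 0;
    return prepare_calls == before_prepare + 1
        && parent_calls == before_parent + 1
        && WIFEXITED(status)
        && WEXITSTATUS(status) == 0;
}
```

Calling `install_fork_hooks()` once at startup and then forking normally is enough for the handlers to run. The cost is a link-time dependency on the threads library, which is the reason the mail gives for leaving registration to the caller.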
With libevent, you can't free the event base in general (program crashes with an assertion has has been documented on this list a number of times). > > * there are two types of timers, based on real time differences and wall > > clock time (cron-like). timers can also be repeating and be reset at > > almost no cost (for idle timeouts used by many network servers). time jumps > > get detected reliably in both directions with or without a monotonic clock. > > But then they're not truly "real-time", no? Within the limits of technology, they are: - timers (based on monotonic time) will time out after "n" seconds (whatever was configured), even if the date resets in between (libevent can do that only for backward time jumps). - periodicals will simply be rescheduled, if a periodic timer is scheduled to fire "at" some point then it will not be affected by the time jump, it will still fire at that point (its more complicated with periodic timers scheduled to repeat, if you schedule a periodic timer to execute on every minute than libev will try to schedule it to occur when time() % 60 == 0, regardless of any time jumps. Of course, detecting and correcting this cnanot be done completely reliable with sub-second precision (there is no API in posix to do that), but with a monotonic clock, libev should manage quite fine to detect even very small time jumps caused by ntpd. (With no monotonic clock its a heuristic, obviously). > > * event watchers can be added and removed at any time (in libevent, > > removing events that are pending can lead to crashes). > > This is news to me. Can you give more detail, maybe with pointers to code? That is how I read the code in event.c on first glance. But in fact, it seems to be safe. I initially thought only removing the current watcher is safe. (I was in fact fooled by some bugs in the testsuite). 
Sorry for the confusion, I was too busy implementing all the other goodies, and right now I am busy implementing the remaining parts to get 100% event API support. > > * different types of events use different watchers, so you don't have > > to use an i/o event watcher for timeouts, and you can reset timers > > seperately from other types of watchers. Also, watchers are much smaller > > (even the libevent emulation watcher only has about 2/3 of the size of a > > libevent watcher). > > libevnet does this for I/O; timer is always set separately from read/write > events. (Point being, its using libevent.) Even libevent is somewhat fast if you don't combine timeouts and io watchers in the same struct event. But it is of course quite the waste. > > * I added idle watchers, pid watchers and hook watchers into the event loop, > > as is required for integration of other event-based libraries, without > > having to force the use of some construct around event_loop. > > Needing to do an operation on every loop is arguably very rare, and there's > not much burden in rolling your own. Its a quality issue. If you have a program that uses libevent and a library that needs to hook into it, it simply cannot be done. I happen to have many such cases. It basiclaly happens as soon as you use libevent as some part of some big program (such as in form of a perl module :), where many components might want to hook into the event loop. With that functionality in place, you can do it. Without it, you simply fail. It doesn't matter much, as libev is still faster than libevent even with all those watcher types. > PID watchers, likewise... how many spots in the code independently > manage processes (as opposed to one unit which can just catch > SIGCHLD). You could say the same about signals and be right just as well, they are not as important as io and timers. But a high-quality implementation should support these, as those are events, too, and, like select/poll etc. 
you can't share them within a single process, there must be a single event scheduler for that. So yes, they are not extremely important, but when you need that fucntionality and it isn't there, you simply fail. Again, think about a library hooking into this mechanism. As a practical example, again perl has an AnyEvent module. You can use i/o, timer and signal/child events portably with AnyEvent, and when using it, your software module will magically work under Gtk, libevent, Tk etc. > Also, curious how/if you've considered Win32 environments. I consider them a goal, albeit a low-priority one (until I find need for it myself). If libev code gets integrated into libevent it of course is a must to be portable to this unfortunately important platform. Likewise if somebody would need it and I could make it happen. Fortunately, libevent sets a good precedent and has lots of code. For example, I just wrote a kqueue interface, by looking at the libevent one, which has very important portability hints. Although libev backends are much simpler than libevent backends, I guess its even simpler to compile/debug something that has already been written, so I am confident kqueue will be supported soon. I will also provide generic backends for other platforms when people need them (and can provide me with documentation). win32 is certainly very highon the list. > > * the backends use a much simpler design. unlike in libevent, the code to > > handle events is not duplicated for each backend, backends deal only > > with file descriptor events and a single timeout value, everything else > > is handled by the core, which also optimises state changes (the epoll > > backend is 100 lines in libev, as opposed to >350 lines in libevent, > > without suffering from its limitations). > > libevnet optimizes state changes. 
Logically every I/O request is single-shot > (which is more forgiving to user code), but it actually sets EV_PERSIST and > delays libevent bookkeeping until the [libevnet bufio] callback returns. If > the user code submits another I/O op from its callback (highly likely) then > the event is left unchanged. It's still re-entrant safe because it can > detect further activity up the call chain using some stack message passing > bits (instead of reference counting because I also use mem pools, but I > digress). Again, point being this can be done using libevent as-is. Yes, in a very awkward way, with high resource wastage. My point being is that if I have a high-performance event loop that claims to support all these fine things than it should really not force it on the user to work around its performance problems. > > As for compatibility, the actual libev api is very different to the > > libevent API (although the design is similar), but there is a emulation > > layer with a corresponding event.h file that supports the event library > > (but no evbuffer, evnds, evhttp etc.). > > Well... if you can persuade me of the utility then this Christmas I might > want to investigate writing an evdns-like component. Well, there is no harm in using the libevent api. In fact, you probably should. The evdns code is not something I really look forward to running it (same with evhttp), but it exists, and is libevent compatible. If you stay within the libevent api, you get the best of all worlds, of course :) This can be seen clearly by the script I use to programmatically import the libevent code into libev: http://data.plan9.de/import_libevent *All* the code changes done there are actually either bugfixes or disabling parts of the testsuite that work around stuff or rely on internal structure members not exposed through the API. And you get almost all the advantages of libev even through the libevent API, and even if you don't separate timers from I/O. 
And it uses less memory (watcher size ~150 vs. ~120 bytes) and is faster too. And you can always go back to libevent if problems come up! :) If you want to have a look, here is the libev API header file (which I think would make a fine API for libevent, too): http://cvs.schmorp.de/libev/ev.h (all the macro magic exists so one can build libev with single-event-loop and multiple-event-loop interface. It also deals with including the .c files directly in your package, where you can then configure custom fields for your app to be included into all watchers). Here is the perl documentation. It has a slightly diffeernt interface, but it explains some of the rationale behind the timers, watcher lifetime and the other watchers: http://cvs.schmorp.de/EV/README Given that all this is only a few days old, be nice to me :) I expect to have some basiclaly workign libev release to be ready in a few days or so, though. Right now its sitll kind of a battlefield. > See the "lookup" component of libevnet. There are lots of nice things > I need in a DNS resolver that evdns and others are incapable of > handling. And I've also written more HTTP, RTSP, and SOCKS5 parsers than > I can remember. I will. As a minimum, the interface should be able to let me query a few things comfortably and any other type(s) of records with some less ease, and I will buy it immediately :) Try getting it into libevent so I cna pick it up and everybody else can take advantage of it :) > If you ask me, it would prove more fortuitous to re-write the DNS and HTTP > components then to replace libevent. Reason being because it would be hard > to substantively improve on DNS/HTTP without altering the API, whereas > clearly its feasible to improve libevent under the hood without altering the > existing API, and then building your new features on top of this. That's > sort of what I did with libevnet, by adding buffered I/O, DNS, and thread > management API atop libevnet. Yes, I see this now. 
I originally expected those components to be, well, not that full-features and correct (w.r.t. the protocol), but basically well-tested and working. The problem is that I am prepared to maintainer my event library, but I would really like to have some async dns, too, and the code exists, which is a good argument for using it (it *does* work, after all :) So for me, using all that code brings me a lot of features quickly. I am not conerned about bloat, as one can still include the full libev core without anythign else very easily: #define EV_STANDALONE 1 #define EV_MULTIPLICITY 0 #define EV_USE_EPOLL 1 #include "libev/ev.h" #include "libev/ev.c" > > So, would there be an interest in replacing the "core" event part of > > libevent with the libev code? If yes, there are a number of issues to > > solve, and here is how I would solve them: > > Win32 support is important to me, unfortuantely. As it likely is to others. > libevent has a very, very large installed base of users. Yes, another reason to make a high-quality emulation of its API. Even if it means relying on some bugs. Ok, you convinced me, I will look at how libevent handles win32 and likely adopt its code. It should be realtively cheap, as I already reuse the libevent configure stuff, too and have all the required files. It of course also helps that libev has less confifgure requirements. > > * libev only supports select and epoll. Adding poll would be trivial for me, > > I always thought it would be easier to just create kqueue wrappers around > epoll, poll, select, et al, and then build a library on that. Once you start > adding things like PID events, etc, its at least worth some thought. poll has been added already, btw. And I am not sure what you mean, the epoll backend alone is very simple, as are select and poll (all are aorund 100 bytes), while the kqueue backend is 160 lines already. 
It seems wrapping around kqueue would only add overhead (conceptually mostly in that you have to program more and more complicated stuff). Here is the epoll backend to illustrate how simple it really is: http://cvs.schmorp.de/libev/ev_epoll.c kqueue in itself doesn't add anything we can already do, too, as libev (just like libevent) has to provide stuff like signals and chld notifications portably, and having just one version of the code is betetr for reducing bugs and maintainance burden. Reminds me, libevent seems to forget about signals in one event loop when another one is running, I need to fix this in libev too, now that I have multiple event loops :/ > So you (1) want to support multiple destination events for the same source > but (2) let multiple threads signal discrete events? That sounds like an > invitation for even more trouble. Now you've got mutexes littered all over > the place. That's my version of a nightmare. And if you group discrete > events into the same thread, then you haven't solved the CPU workload > problem. > > I too prefer a single event base. The obvious problem being when you > actually have CPU intensive work, say an MPEG streaming service--lots of > I/O--which also needs to do intermedate AV processing--lots of CPU. Well, I usualyl solve this by havign a thread pool to do the work and integrating that one into my event loop. Works with any event loop, too, if you use a pipe and then use it sparingly. In any case, this entry is outdated, libev in CVS supports multiple event bases, in a fully compile-time optional way, so one can chose wether to have multiple event bases or have everything as static global variables. And the API is more obvious (for example, there is no event_set that temporairly re-assigns your active event to some other event base, as http.c currently does), but basically all functions that need it have as first argument the event base they need to work on. > Though, I think we've already had this debate... so... 
I'll just shut-up ;)

I'll support it, I'll support it. I admit some of my views are policy and not
requirement, and I admit I just didn't want to litter my API with an
event_base, but I found a clean way to support both, so I will provide both.

Thanks a lot for your input!

-- 
                The choice of a       Deliantra, the free code+content MORPG
      -----==-     _GNU_              http://www.deliantra.net
      ----==-- _       generation
      ---==---(_)__  __ ____  __      Marc Lehmann
      --==---/ / _ \/ // /\ \/ /      pcg@goof.com
      -=====/_/_//_/\_,_/ /_/\_\

From libev at schmorp.de Sat Nov 3 22:02:02 2007
From: libev at schmorp.de (Marc Lehmann)
Date: Sat Nov 3 22:02:08 2007
Subject: [Libevent-users] test/*.c files use installed header files not the ones from the package
In-Reply-To: <850f7cbe0711031618o42a96b0dt232912d533456953@mail.gmail.com>
References: <20071103225210.GC11441@schmorp.de> <850f7cbe0711031618o42a96b0dt232912d533456953@mail.gmail.com>
Message-ID: <20071104020202.GC25876@schmorp.de>

On Sat, Nov 03, 2007 at 04:18:19PM -0700, Niels Provos wrote:
> I appreciate your insights, but your message has nothing to do with
> libevent. The make files in libevent use the -I option to provide the
> path to the header files.

Well, since you threw documentation at me before, let me just say it's high
time to actually read up on what those -I switches do to various
compilers....

But then, I see you are not really interested in fixing bugs:

> I also noticed that you seem to have found several bugs in libevent.

And I keep finding more the more I look at it.

> It would be nice if you could send patches for them.

I did that for all bugs where I could identify the right solution. Most
bugs are actually fixed by the event core replacement code I wrote, for
the remaining ones I sent in detailed instructions on how to patch the file.

> I am a little bit dubious about some of the claims such as not being
> able to remove an event while in a callback.

Yes, that seems to be simply wrong, sorry for that.
If you keep hitting and finding bugs (or undocumented features, such as not beign able to have multiple event watchers for most events, which is probably not a bug, but certainly something outside of my expectation for an event loop), sometimes one makes a mistake. I am, however, not that thrilled anymore about contributing to libevent after the treatment you give to me, ignoring my patches, and insulting me :( -- The choice of a Deliantra, the free code+content MORPG -----==- _GNU_ http://www.deliantra.net ----==-- _ generation ---==---(_)__ __ ____ __ Marc Lehmann --==---/ / _ \/ // /\ \/ / pcg@goof.com -=====/_/_//_/\_,_/ /_/\_\ From nickm at freehaven.net Sat Nov 3 22:11:53 2007 From: nickm at freehaven.net (Nick Mathewson) Date: Sat Nov 3 22:12:10 2007 Subject: [Libevent-users] test/*.c files use installed header files not the ones from the package In-Reply-To: <20071104020202.GC25876@schmorp.de> References: <20071103225210.GC11441@schmorp.de> <850f7cbe0711031618o42a96b0dt232912d533456953@mail.gmail.com> <20071104020202.GC25876@schmorp.de> Message-ID: <20071104021153.GG20993@totoro.wangafu.net> On Sun, Nov 04, 2007 at 03:02:02AM +0100, Marc Lehmann wrote: > On Sat, Nov 03, 2007 at 04:18:19PM -0700, Niels Provos wrote: > > I appreciate your insights, but your message has nothing to do with > > libevent. The make files in libevent use the -I option to provide the > > path to the header files. > > Well, since you thre documentation at me before, let me just say its high > time to actually read up on what those -I switches do to various > compilers.... > > But then, I see you are not really interested in fixing bugs: I am indeed interested in fixing bugs. So is Niels. I am going through the emails you have sent me in the order I got them, trying to confirm or disconfirm the bugs, and looking for the fixes you made. > > I also noticed that you seem to have found several bugs in libevent. > > And I keep finding more the more I look at it. 
> > > It would be nice if you could send patches for them. > > I did that for all bugs where I could identify the right solution. Most > bugs are actually fixed by the event core replacement code I wrote, for > the remaining ones I send in detailed instructions how to patch the file. I think that by "patches", Niels means "diffs". I do not see where those are; please forgive me if I have missed them. > > I am a little bit dubious about some of the claims such as not being > > able to remove an event while in a callback. > > Yes, that seems to be simply wrong, sorry for that. If you keep hitting > and finding bugs (or undocumented features, such as not beign able to have > multiple event watchers for most events, which is probably not a bug, but > certainly something outside of my expectation for an event loop), sometimes > one makes a mistake. Right. Anybody can make mistakes. That's how bugs happen. But you can make mistakes too: that means that we need to confirm the bugs that you send in (unit tests and diffs would be ideal) and confirm that the fixes fix the bugs and don't break anything else. I'm sorry that this isn't going as fast as possible. As mentioned earlier, I'm at a party in cape cod this weekend, and right now I'm taking time out to try to help get this stuff going better. > I am, however, not that thrilled anymore about contributing to libevent after > the treatment you give to me, ignoring my patches, and insulting me :( Marc, Niels: There are ways we could all communicate more politely and effectively. Let's take this stuff off-list. It doesn't help make a better libevent to argue about it here. many thanks, and hopes for a better working relationship in the future, -- Nick Mathewson -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 652 bytes Desc: not available Url : http://monkeymail.org/archives/libevent-users/attachments/20071103/944a1871/attachment-0001.bin From libev at schmorp.de Sun Nov 4 11:13:57 2007 From: libev at schmorp.de (Marc Lehmann) Date: Sun Nov 4 11:14:10 2007 Subject: [Libevent-users] sensible thread-safe signal handling proposal In-Reply-To: <20071103224539.GA20579@wilbur.25thandClement.com> References: <20071103201507.GA11441@schmorp.de> <20071103224539.GA20579@wilbur.25thandClement.com> Message-ID: <20071104161357.GA25161@schmorp.de> > On Sat, Nov 03, 2007 at 03:45:39PM -0700, William Ahern wrote: > > Curious how you managed to do this. Are you checking the process PID on each > > loop? > > I considered that, but I think its too slow (one also needs to be careful > that watchers don't change e.g. epoll state until the getpid check is > done), or at leats I think I don't want that speed hit, no matter what. After giving signal handling and threads a lot of thought, I came to these conclusions: - requiring pthreads or windows mutexes by default is not acceptable, but thats the only way to distribute signal events among event loops properly, or globally among many threads if signal handling were global. - the only way to do it without locking is to only allow a single loop to handle events. This is the interface I came up with to manage multiple loops (which I think makes more sense than the interface currently in libevent): struct ev_loop *ev_default_loop (int methods); void ev_default_destroy (void); void ev_default_fork (void); this would create "the default" loop (event_base). ev_default_loop would always create the same loop, and it would be the one to use for third-party libraries in general, too. The fork method can be called in the parent or child (or even in both, or without forking), and it would destroy and recreate the kernel state but keep all the watchers for the default loop. 
   struct ev_loop *ev_loop_new (int methods);
   void ev_loop_destroy (EV_P);
   void ev_loop_fork (EV_P);

This would create additional loops (event_bases). The difference is that
these cannot handle signals (or child watchers) at all, with the default loop
being the only one to do signal handling.

This would be consistent with how signals are usually handled in a pthreads
environment: block signals in all threads and in one thread handle them all
(sigwait, or using the default mainloop).

No locking inside libevent would be required this way. I'll implement this in
my libev replacement code, unless somebody else comes up with a better idea.

One such idea that isn't better, but different, would be to require the user
to provide mutex support, such as in ev_init_locking (size, init_cb, lock_cb,
unlock_cb, free_cb) or similar, then use locking and let any event loop
handle the signals and distribute signal events to the relevant loops. But I
am not sure how much locking would be required and I assume it would be a
lot, as one would need to handle the case where one thread handles a signal
for an event_base currently in use by another thread.

Looking at the code in libevent, it seems that signals get handled by
whatever loop was started last, so signal handling is not reliable at all
unless one registers the signal handlers in all threads, which is hard to do
in a thread-safe manner (for the user code). Having a deterministic model
where one loop handles all that would definitely be an improvement over this.
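The "block signals everywhere, collect them in one place" model mentioned above can be illustrated with plain POSIX calls. What follows is only a minimal single-threaded sketch of the sigwait half of that pattern, not code from libev or libevent; the function name wait_for_blocked_signal is made up for illustration. In a threaded program one would presumably call pthread_sigmask() before creating any threads, so that every thread inherits the blocked mask and only the designated thread calls sigwait().

```c
#define _POSIX_C_SOURCE 200809L
#include <signal.h>
#include <stddef.h>

/*
 * Hypothetical helper: block `signo`, queue one instance of it, then
 * collect it synchronously with sigwait() instead of running an
 * asynchronous handler. Returns the signal number received, or -1 on
 * error.
 */
int wait_for_blocked_signal(int signo)
{
    sigset_t set, old;
    int received = -1;

    sigemptyset(&set);
    sigaddset(&set, signo);

    /* Block the signal; it can now only become pending, never delivered. */
    if (sigprocmask(SIG_BLOCK, &set, &old) != 0)
        return -1;

    /* Send the signal to ourselves; it stays pending while blocked. */
    raise(signo);

    /* Take the pending signal off the queue, the way a dedicated
     * signal-handling thread (or the default loop) would. */
    if (sigwait(&set, &received) != 0)
        received = -1;

    /* Restore the previous mask; the signal was already consumed. */
    sigprocmask(SIG_SETMASK, &old, NULL);
    return received;
}
```

Because the signal is consumed synchronously, no handler ever runs in an arbitrary thread, which is the property that lets the single-loop design avoid locking.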
-- The choice of a Deliantra, the free code+content MORPG -----==- _GNU_ http://www.deliantra.net ----==-- _ generation ---==---(_)__ __ ____ __ Marc Lehmann --==---/ / _ \/ // /\ \/ / pcg@goof.com -=====/_/_//_/\_,_/ /_/\_\ From sgrimm at facebook.com Sun Nov 4 15:15:56 2007 From: sgrimm at facebook.com (Steven Grimm) Date: Sun Nov 4 15:16:06 2007 Subject: [Libevent-users] sensible thread-safe signal handling proposal In-Reply-To: <20071104161357.GA25161@schmorp.de> References: <20071103201507.GA11441@schmorp.de> <20071103224539.GA20579@wilbur.25thandClement.com> <20071104161357.GA25161@schmorp.de> Message-ID: On Nov 4, 2007, at 8:13 AM, Marc Lehmann wrote: > This would create additional loops (event_bases). The difference is > that > these cannot handle signals (or child watchers) at all, with the > default loop > being the only one to do signal handling. This seems like a totally sane approach to me. Having multiple loops is a big performance win for some applications (e.g., memcached in multithreaded mode), so making the behavior a bit more consistent is a good thing. Now if only there were a way to wake just one thread up when input arrives on a descriptor being monitored by multiple threads... But that isn't supported by any of the underlying poll mechanisms as far as I can tell. -Steve From adrian at creative.net.au Sun Nov 4 18:03:03 2007 From: adrian at creative.net.au (Adrian Chadd) Date: Sun Nov 4 18:00:35 2007 Subject: [Libevent-users] sensible thread-safe signal handling proposal In-Reply-To: References: <20071103201507.GA11441@schmorp.de> <20071103224539.GA20579@wilbur.25thandClement.com> <20071104161357.GA25161@schmorp.de> Message-ID: <20071104230302.GH26480@skywalker.creative.net.au> On Sun, Nov 04, 2007, Steven Grimm wrote: > Now if only there were a way to wake just one thread up when input > arrives on a descriptor being monitored by multiple threads... But > that isn't supported by any of the underlying poll mechanisms as far > as I can tell. 
Would this be for listen sockets, or for general read/write IO on an FD?

Adrian

From clayne at anodized.com Sun Nov 4 18:07:03 2007
From: clayne at anodized.com (Christopher Layne)
Date: Sun Nov 4 18:07:47 2007
Subject: [Libevent-users] sensible thread-safe signal handling proposal
In-Reply-To:
References: <20071103201507.GA11441@schmorp.de>
	<20071103224539.GA20579@wilbur.25thandClement.com>
	<20071104161357.GA25161@schmorp.de>
Message-ID: <20071104230703.GA19235@ns1.anodized.com>

On Sun, Nov 04, 2007 at 12:15:56PM -0800, Steven Grimm wrote:
> On Nov 4, 2007, at 8:13 AM, Marc Lehmann wrote:
> >This would create additional loops (event_bases). The difference is
> >that
> >these cannot handle signals (or child watchers) at all, with the
> >default loop
> >being the only one to do signal handling.
>
> This seems like a totally sane approach to me. Having multiple loops
> is a big performance win for some applications (e.g., memcached in
> multithreaded mode), so making the behavior a bit more consistent is a
> good thing.

It's only a performance win when the cost of context switches and cache
stomping, as a result of multiple threads cycling within their own context,
does not outweigh the "latency" of a model using fewer threads, or even one.

Consider a room with 20 people in it and a single door. The goal is to hand
them a football as a new football is dropped off the assembly line and have
them exit the door. You could throw them all a new football right as it comes
off the line and have them immediately rush for the door - resulting in a log
jam that one has to stop tending the assembly line to handle. You then head
back to the line and begin the patterned task of throwing footballs to
workers as fast as you can - only to have the log jam repeat itself. The only
way to solve this efficiently is to have fewer people try to exit the door at
once, or add more doors (CPUs).
> Now if only there were a way to wake just one thread up when input > arrives on a descriptor being monitored by multiple threads... But > that isn't supported by any of the underlying poll mechanisms as far > as I can tell. > > -Steve It isn't typically supported because it's not a particularly useful or efficient path to head down in the first place. Thread pools being what they are, incredibly useful and pretty much the de facto in threaded code, do have their own abstraction limits as well. Setting up a thread pool, an inherently asynchronous and unordered collection of contexts, to asynchronously process an ordered stream of data (unless your protocol has no "sequence", which I doubt), which I presume to somehow be in the name of performance, is way more complex and troublesome design than it needs to be. It's anchored somewhat to the "every thread can do anything" school of thought which has many hidden costs. The issue in itself is having multiple threads monitor the *same* fd via any kind of wait mechanism. It's short circuiting application layers, so that a thread (*any* thread in that pool) can immediately process new data. I think it would be much more structured, less complex (i.e. better performance in the long run anyways), and a cleaner design to have a set number (or even 1) thread handle the "controller" task of tending to new network events, push them onto a per-connection PDU queue, or pre-process in some form or fashion, condsig, and let previously mentioned thread pool handle it in an ordered fashion. Having a group of threads listening to the same fd has now just thrown our football manager out entirely and become a smash-and-grab for new footballs. There's still the door to get through. 
-cl From sgrimm at facebook.com Sun Nov 4 18:11:32 2007 From: sgrimm at facebook.com (Steven Grimm) Date: Sun Nov 4 18:11:36 2007 Subject: [Libevent-users] sensible thread-safe signal handling proposal In-Reply-To: <20071104230302.GH26480@skywalker.creative.net.au> References: <20071103201507.GA11441@schmorp.de> <20071103224539.GA20579@wilbur.25thandClement.com> <20071104161357.GA25161@schmorp.de> <20071104230302.GH26480@skywalker.creative.net.au> Message-ID: <33FD98C8-075B-403A-915A-EBCC22F4DD59@facebook.com> On Nov 4, 2007, at 3:03 PM, Adrian Chadd wrote: > On Sun, Nov 04, 2007, Steven Grimm wrote: > >> Now if only there were a way to wake just one thread up when input >> arrives on a descriptor being monitored by multiple threads... But >> that isn't supported by any of the underlying poll mechanisms as far >> as I can tell. > > Would this be for listen sockets, or for general read/write IO on an > FD? Specifically for a mixed TCP- and UDP-based protocol where any thread is equally able to handle an incoming request on the UDP socket, but TCP sockets are bound to particular threads. Unfortunately the vast majority of incoming requests are on the UDP socket, too many to handle on one thread. Before anyone suggests it: a message-passing architecture (one thread reads the UDP socket and queues up work for other threads) gave me measurably higher request-handling latency than the current setup, which works but eats lots of system CPU time because all the threads wake up on each UDP packet. It makes sense: the current scheme involves fewer context switches for a given request (at least, on the thread that ends up handling it), and context switches aren't free. Ideally I'd love a mode where I could say, "Only trigger one of the waiting epoll instances when this descriptor has input available." Sort of pthread_cond_signal() semantics, as opposed to the current pthread_cond_broadcast() semantics. 
(Yes, I'm aware that pthread_cond_signal() is not *guaranteed* to wake up only one waiting thread -- but in practice that's what it does.) -Steve From adrian at creative.net.au Sun Nov 4 18:20:40 2007 From: adrian at creative.net.au (Adrian Chadd) Date: Sun Nov 4 18:18:03 2007 Subject: [Libevent-users] sensible thread-safe signal handling proposal In-Reply-To: <33FD98C8-075B-403A-915A-EBCC22F4DD59@facebook.com> References: <20071103201507.GA11441@schmorp.de> <20071103224539.GA20579@wilbur.25thandClement.com> <20071104161357.GA25161@schmorp.de> <20071104230302.GH26480@skywalker.creative.net.au> <33FD98C8-075B-403A-915A-EBCC22F4DD59@facebook.com> Message-ID: <20071104232040.GA3367@skywalker.creative.net.au> On Sun, Nov 04, 2007, Steven Grimm wrote: > >Would this be for listen sockets, or for general read/write IO on an > >FD? > > Specifically for a mixed TCP- and UDP-based protocol where any thread > is equally able to handle an incoming request on the UDP socket, but > TCP sockets are bound to particular threads. Makes sense. Doesn't solaris event ports system handle this? I haven't checked in depth. It sounds like something that kqueue could be extended to do relatively easily. What about multiple threads blocking on the same UDP socket? Do multiple threads wake up when IO arrives? Or just one? Adrian From sgrimm at facebook.com Sun Nov 4 18:18:42 2007 From: sgrimm at facebook.com (Steven Grimm) Date: Sun Nov 4 18:18:46 2007 Subject: [Libevent-users] sensible thread-safe signal handling proposal In-Reply-To: <20071104230703.GA19235@ns1.anodized.com> References: <20071103201507.GA11441@schmorp.de> <20071103224539.GA20579@wilbur.25thandClement.com> <20071104161357.GA25161@schmorp.de> <20071104230703.GA19235@ns1.anodized.com> Message-ID: <03765F1F-6EE4-42A0-9CD3-EBA36D5CD804@facebook.com> On Nov 4, 2007, at 3:07 PM, Christopher Layne wrote: > The issue in itself is having multiple threads monitor the *same* fd > via any > kind of wait mechanism. 
It's short circuiting application layers, so > that a > thread (*any* thread in that pool) can immediately process new data. > I think > it would be much more structured, less complex (i.e. better > performance in > the long run anyways), and a cleaner design to have a set number (or > even > 1) thread handle the "controller" task of tending to new network > events, > push them onto a per-connection PDU queue, or pre-process in some > form or > fashion, condsig, and let previously mentioned thread pool handle it > in an > ordered fashion. You've just pretty accurately described my initial implementation of thread support in memcached. It worked, but it was both more CPU- intensive and had higher response latency (yes, I actually measured it) than the model I'm using now. The only practical downside of my current implementation is that when there is only one UDP packet waiting to be processed, some CPU time is wasted on the threads that don't end up winning the race to read it. But those threads were idle at that instant anyway (or they wouldn't have been in a position to wake up) so, according to my benchmarking, there doesn't turn out to be an impact on latency. And though I am wasting CPU cycles, my total CPU consumption still ends up being lower than passing messages around between threads. It wasn't what I expected; I was fully confident at first that the thread-pool, work-queue model would be the way to go, since it's one I've implemented in many applications in the past. But the numbers said otherwise. 
-Steve From slamb at slamb.org Sun Nov 4 19:09:08 2007 From: slamb at slamb.org (Scott Lamb) Date: Sun Nov 4 19:09:14 2007 Subject: [Libevent-users] announcing libev, towards a faster and more featureful libevent In-Reply-To: <20071104015702.GB25876@schmorp.de> References: <20071103201507.GA11441@schmorp.de> <20071103224539.GA20579@wilbur.25thandClement.com> <20071104015702.GB25876@schmorp.de> Message-ID: <472E5F24.3060206@slamb.org> Marc Lehmann wrote: >>> * there are two types of timers, based on real time differences and wall >>> clock time (cron-like). timers can also be repeating and be reset at >>> almost no cost (for idle timeouts used by many network servers). time jumps >>> get detected reliably in both directions with or without a monotonic clock. >> But then they're not truly "real-time", no? > > Within the limits of technology, they are: > > - timers (based on monotonic time) will time out after "n" seconds (whatever > was configured), even if the date resets in between (libevent can do that > only for backward time jumps). > > - periodicals will simply be rescheduled, if a periodic timer is scheduled > to fire "at" some point then it will not be affected by the time jump, > it will still fire at that point (its more complicated with periodic > timers scheduled to repeat, if you schedule a periodic timer to execute > on every minute than libev will try to schedule it to occur when time() > % 60 == 0, regardless of any time jumps. > > Of course, detecting and correcting this cnanot be done completely > reliable with sub-second precision (there is no API in posix to do that), > but with a monotonic clock, libev should manage quite fine to detect even > very small time jumps caused by ntpd. > > (With no monotonic clock its a heuristic, obviously). Have you seen the new Linux timerfd API? Where available, you can wait for CLOCK_MONOTONIC and CLOCK_REALTIME events independently. 
Beats heuristics, and detecting time jumps sound like introducing a lot of extra timeouts. I'd hate to see libev(ent)? show up on PowerTOP after just getting rid of the 5-second timeout. From slamb at slamb.org Sun Nov 4 19:23:01 2007 From: slamb at slamb.org (Scott Lamb) Date: Sun Nov 4 19:23:15 2007 Subject: [Libevent-users] sensible thread-safe signal handling proposal In-Reply-To: <03765F1F-6EE4-42A0-9CD3-EBA36D5CD804@facebook.com> References: <20071103201507.GA11441@schmorp.de> <20071103224539.GA20579@wilbur.25thandClement.com> <20071104161357.GA25161@schmorp.de> <20071104230703.GA19235@ns1.anodized.com> <03765F1F-6EE4-42A0-9CD3-EBA36D5CD804@facebook.com> Message-ID: <472E6265.4090502@slamb.org> Steven Grimm wrote: > On Nov 4, 2007, at 3:07 PM, Christopher Layne wrote: >> The issue in itself is having multiple threads monitor the *same* fd >> via any >> kind of wait mechanism. It's short circuiting application layers, so >> that a >> thread (*any* thread in that pool) can immediately process new data. I >> think >> it would be much more structured, less complex (i.e. better >> performance in >> the long run anyways), and a cleaner design to have a set number (or even >> 1) thread handle the "controller" task of tending to new network events, >> push them onto a per-connection PDU queue, or pre-process in some form or >> fashion, condsig, and let previously mentioned thread pool handle it >> in an >> ordered fashion. > > You've just pretty accurately described my initial implementation of > thread support in memcached. It worked, but it was both more > CPU-intensive and had higher response latency (yes, I actually measured > it) than the model I'm using now. The only practical downside of my > current implementation is that when there is only one UDP packet waiting > to be processed, some CPU time is wasted on the threads that don't end > up winning the race to read it. 
But those threads were idle at that
> instant anyway (or they wouldn't have been in a position to wake up) so,
> according to my benchmarking, there doesn't turn out to be an impact on
> latency. And though I am wasting CPU cycles, my total CPU consumption
> still ends up being lower than passing messages around between threads.
>
> It wasn't what I expected; I was fully confident at first that the
> thread-pool, work-queue model would be the way to go, since it's one
> I've implemented in many applications in the past. But the numbers said
> otherwise.

Thanks for the case study. To rephrase (hopefully correctly), you tried
these two models:

1) one thread polls and puts events on a queue; a bunch of other threads
pull from the queue. (resulted in high latency, and I'm not too
surprised...an extra context switch before handling any events.)

2) a bunch of threads read and handle events independently. (your
current model.)

Did you also try the so-called "leader/follower" model, in which the
thread which does the polling handles the first event and puts the rest
on a queue; another thread takes over polling if otherwise idle while
the first thread is still working? My impression is that this was a widely
favored model, though I don't know the details of where each performs best.
From william at 25thandClement.com Sun Nov 4 19:48:18 2007 From: william at 25thandClement.com (William Ahern) Date: Sun Nov 4 19:48:41 2007 Subject: [Libevent-users] sensible thread-safe signal handling proposal In-Reply-To: <03765F1F-6EE4-42A0-9CD3-EBA36D5CD804@facebook.com> References: <20071103201507.GA11441@schmorp.de> <20071103224539.GA20579@wilbur.25thandClement.com> <20071104161357.GA25161@schmorp.de> <20071104230703.GA19235@ns1.anodized.com> <03765F1F-6EE4-42A0-9CD3-EBA36D5CD804@facebook.com> Message-ID: <20071105004818.GA14029@wilbur.25thandClement.com> On Sun, Nov 04, 2007 at 03:18:42PM -0800, Steven Grimm wrote: > You've just pretty accurately described my initial implementation of > thread support in memcached. It worked, but it was both more CPU- > intensive and had higher response latency (yes, I actually measured > it) than the model I'm using now. The only practical downside of my > current implementation is that when there is only one UDP packet > waiting to be processed, some CPU time is wasted on the threads that > don't end up winning the race to read it. But those threads were idle > at that instant anyway (or they wouldn't have been in a position to > wake up) so, according to my benchmarking, there doesn't turn out to > be an impact on latency. And though I am wasting CPU cycles, my total > CPU consumption still ends up being lower than passing messages around > between threads. > Is this on Linux? They addressed the stampeding herd problem years ago. If you dig deep down in the kernel you'll see their waitq implemention for non-blocking socket work (and lots of other stuff). Only one thread is ever woken per event. 
From slamb at slamb.org Sun Nov 4 20:04:25 2007 From: slamb at slamb.org (Scott Lamb) Date: Sun Nov 4 20:04:42 2007 Subject: [Libevent-users] announcing libev, towards a faster and more featureful libevent In-Reply-To: <20071103201507.GA11441@schmorp.de> References: <20071103201507.GA11441@schmorp.de> Message-ID: <472E6C19.5060502@slamb.org> Marc Lehmann wrote: > Hi! > > On tuesday, I sent mail about various problems with libevent and its > current API as well as implementation. Unfortunately, the mail has not yet > shown up, but fortunately, it has been superseded by this one :) > > In that mail, I announced that I will work on the problems I encountered > in libevent (many of which have been reported and discusssed earlier on > this list). After analyzing libevent I decided that it wasn't fixable > except by rewriting the core parts of it (the inability to have multiple > watchers for the same file descriptor event turned out to be blocking for > my applications, otherwise I wouldn't have started the effort in the first > place...). > > The results look promising so far: I additionally implemented a libevent > compatibility layer and benchmarked both libraries using the benchmark > program provided by libevent: http://libev.schmorp.de/bench.html > > Here is an incomplete list of what I changed and added (see the full > list at http://cvs.schmorp.de/libev/README, or the cvs repository at > http://cvs.schmorp.de/libev/): > > fixed or improved: > > * multiple watchers can wait for the same event, there is no limitation > to one or two watchers for signals and io. Could you give me an example of where that is important? > * there is full support for fork, you can continue to use the event loop > in the parent and child (or just one of them), even with quirky backends > such as epoll. Nice; seems like that's come up on the list several times. > * there are two types of timers, based on real time differences and wall > clock time (cron-like). 
timers can also be repeating and be reset at > almost no cost (for idle timeouts used by many network servers). time jumps > get detected reliably in both directions with or without a monotonic clock. (See my other mail about Linux's new timerfd facility.) Nice; repeating and absolute timers have come up several times before, too. > * timers are managed by a priority queue (O(1) for important operations > as opposed to O(log n) in libevent, also resulting in much simpler code). In terms of concrete data types, you appear to have used a binary heap? So by "important operations" you mean removal, correct? Insertion is still O(log n)? The asymptotic behavior is no different, then, as insertion happens at least as often as removal. > * event watchers can be added and removed at any time (in libevent, > removing events that are pending can lead to crashes). (They don't lead to crashes, as someone mentioned.) > * different types of events use different watchers, so you don't have > to use an i/o event watcher for timeouts, and you can reset timers > seperately from other types of watchers. Also, watchers are much smaller > (even the libevent emulation watcher only has about 2/3 of the size of a > libevent watcher). Nice; separate types can be nice for documentation purposes if nothing else. > > * I added idle watchers, pid watchers and hook watchers into the event loop, > as is required for integration of other event-based libraries, without > having to force the use of some construct around event_loop. Pardon my ignorance, but what are hook watchers? pid watchers I assume to be a fancy SIGCHLD handler? That's a potentially useful feature, but why would it require a construct around event_loop? > * the backends use a much simpler design. 
unlike in libevent, the code to > handle events is not duplicated for each backend, backends deal only > with file descriptor events and a single timeout value, everything else > is handled by the core, which also optimises state changes (the epoll > backend is 100 lines in libev, as opposed to >350 lines in libevent, > without suffering from its limitations). Nice. > As for compatibility, the actual libev api is very different to the > libevent API (although the design is similar), but there is a emulation > layer with a corresponding event.h file that supports the event library > (but no evbuffer, evnds, evhttp etc.). I think the API needs more hashing out. It is...different...but I'm not sure it's necessarily better, and I don't like change for change's sake. A few notes: * what is the purpose of EV_COMMON? From first glance, I'm concerned that it could not be used properly unless libev.so and all callers are compiled with the same flags, which seems impractical if the library ever gains wide use. * on ev_once failure, you're calling the callback with EV_ERROR? Yuck. That's quite surprising behavior, and I could see it leading to stack overflows as each ev_once tries to issue another one. * What's your use case for ev_loop_new() and ev_loop_default()'s bitmask of allowed implementations? * (again, just skimming) you're closing fds automatically on ENOMEM? Ergh. That seems rather undesirable for many applications. 
Cheers, Scott From clayne at anodized.com Sun Nov 4 20:19:36 2007 From: clayne at anodized.com (Christopher Layne) Date: Sun Nov 4 20:19:40 2007 Subject: [Libevent-users] announcing libev, towards a faster and more featureful libevent In-Reply-To: <472E6C19.5060502@slamb.org> References: <20071103201507.GA11441@schmorp.de> <472E6C19.5060502@slamb.org> Message-ID: <20071105011936.GB19235@ns1.anodized.com> On Sun, Nov 04, 2007 at 05:04:25PM -0800, Scott Lamb wrote: > > * timers are managed by a priority queue (O(1) for important operations > > as opposed to O(log n) in libevent, also resulting in much simpler code). > > In terms of concrete data types, you appear to have used a binary heap? > So by "important operations" you mean removal, correct? Insertion is > still O(log n)? The asymptotic behavior is no different, then, as > insertion happens at least as often as removal. Not to put on my O-face, but binary heap insert is *average* O(1). There should be a performance win for libevent, when it comes to timer checking, as using a heap will also be O(1) for heap_min() - which will benefit pending timer calculation code. However, early delete w/ pending timer will need some rigging or tombstone games. I wouldn't be surprised that, in a case where one is consistently resetting timers (think read/write before x time passes) and re-adding said event, that in the end it will be the same amount of CPU time. -cl From libev at schmorp.de Sun Nov 4 20:36:27 2007 From: libev at schmorp.de (Marc Lehmann) Date: Sun Nov 4 20:36:35 2007 Subject: [Libevent-users] announcing libev, towards a faster and more featureful libevent In-Reply-To: <472E6C19.5060502@slamb.org> References: <20071103201507.GA11441@schmorp.de> <472E6C19.5060502@slamb.org> Message-ID: <20071105013627.GA763@schmorp.de> On Sun, Nov 04, 2007 at 05:04:25PM -0800, Scott Lamb wrote: > > * multiple watchers can wait for the same event, there is no limitation > > to one or two watchers for signals and io. 
>
> Could you give me an example of where that is important?

Mostly in environments using some form of garbage collection. For example,
this idiom is common in Perl:

   $event = EV::io $fd, ...

If $event happens to contain an old watcher and $fd happens to refer to
the same fd as that old watcher, this will lead to problems as the
watchers are both alive for a short time.

There is actually a lot of code that relies on this just working, and the
only other event loop I know that has a problem with this is Tk.

> > * there are two types of timers, based on real time differences and wall
> > clock time (cron-like). timers can also be repeating and be reset at
> > almost no cost (for idle timeouts used by many network servers). time jumps
> > get detected reliably in both directions with or without a monotonic clock.
>
> (See my other mail about Linux's new timerfd facility.)

(timerfd unfortunately makes little sense for this, as it adds overhead
but I can't see the compelling advantage, as one will still run into the
same time jump problems with periodic timers).

> Nice; repeating and absolute timers have come up several times before, too.

This was something I always missed in event loops. That is, some event
loops have one timer type, some the other, but never both.

> > * timers are managed by a priority queue (O(1) for important operations
> > as opposed to O(log n) in libevent, also resulting in much simpler code).
>
> In terms of concrete data types, you appear to have used a binary heap?
> So by "important operations" you mean removal, correct?

removal: O(log n)
insertion: O(log n)
find next: O(1)

> still O(log n)? The asymptotic behavior is no different, then, as
> insertion happens at least as often as removal.
Yes, but:

a) finding the next timer is a constant-time operation
b) a red-black tree is more than three times as slow

(see the updated benchmark at http://libev.schmorp.de/bench.html,
especially the difference between the first (no timers) and the second
examples (timers in use))

> > * I added idle watchers, pid watchers and hook watchers into the event loop,
> > as is required for integration of other event-based libraries, without
> > having to force the use of some construct around event_loop.
>
> Pardon my ignorance, but what are hook watchers?

If you want to plug other event-based libraries into the event loop, you
need to be able to hook into the event loop. This is what those
watcher types provide.

The alternative would be to write your own event_loop with EV_NONBLOCK, but
that isn't modular; that is, if you have two different software modules
each having its own event_loop that you *must* use, you lose. Prepare/check
watchers solve this problem nicely.

A number of event loops have them, and they are useful for other things,
such as transparently integrating coroutine packages etc.

It's not a killer feature, just very, very useful in some cases.

> pid watchers I assume to be a fancy SIGCHLD handler?

Yes.

> That's a potentially useful feature, but why would it require a
> construct around event_loop?

I don't understand that; there is no construct around event_loop, it's
handled completely separately.

The reason it exists is to allow sharing this potentially unsharable
resource. For example, poll and select let you do "everything" (with fds),
but you can of course only have one component per (single-thread) process
using them, as they are blocking.

The same thing is true for signals: you can't share them with sigaction, as
sigaction only allows one user.

And the same thing is true for SIGCHLD.
If your event loop provides support for it, you are less likely to run into
a situation where two software packages in the same process need access
to it and stomp over each other.

> > * the backends use a much simpler design. unlike in libevent, the code to
> > handle events is not duplicated for each backend, backends deal only
> > with file descriptor events and a single timeout value, everything else
> > is handled by the core, which also optimises state changes (the epoll
> > backend is 100 lines in libev, as opposed to >350 lines in libevent,
> > without suffering from its limitations).
>
> Nice.

And while investigating the WIN32-Code/win32.c libevent backend, I found
out that it's just a glorified variant of the select backend, except it's
O(n) in registering and deregistering.

> > As for compatibility, the actual libev api is very different to the
> > libevent API (although the design is similar), but there is a emulation
> > layer with a corresponding event.h file that supports the event library
> > (but no evbuffer, evdns, evhttp etc.).
>
> I think the API needs more hashing out. It is...different...but I'm not
> sure it's necessarily better, and I don't like change for change's sake.

There has been no change for change's sake; I can explain the rationale
behind each and every change (I hope :).

> A few notes:
>
> * what is the purpose of EV_COMMON?

Allowing customised event watchers. If you are concerned, treat it as an
internal symbol. Its use is documented here:
http://cvs.schmorp.de/libev/README.embed

> From first glance, I'm concerned that it could not be used properly
> unless libev.so and all callers are compiled with the same flags, which
> seems impractical if the library ever gains wide use.

This is true, but it's an optional feature you don't have to use. In case
you wonder, EV, the Perl interface to libev, uses this feature.

It makes most sense when embedding, of course (not all the world is an .so
:).
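The compile-flags concern raised about EV_COMMON can be made concrete with a small sketch. The code below does not use libev's headers: the macro names WATCHER_COMMON_DEFAULT and WATCHER_COMMON_CUSTOM and both structs are hypothetical stand-ins, assuming only that a watcher struct splices in user-chosen members through a macro. It shows that two translation units built with different definitions would disagree about where the later fields live, which is presumably why the feature fits embedding better than a shared library.

```c
#include <stddef.h>

/* Hypothetical member lists; a real library would have one macro that
 * the embedder may redefine before including the header. */
#define WATCHER_COMMON_DEFAULT void *data;
#define WATCHER_COMMON_CUSTOM  void *data; void *self;

/* Layout as seen by code compiled with the default definition. */
struct watcher_default {
    int active;
    WATCHER_COMMON_DEFAULT
    int fd;
};

/* Layout as seen by code compiled with a customised definition. */
struct watcher_custom {
    int active;
    WATCHER_COMMON_CUSTOM
    int fd;
};

/* Offset of `fd` under each definition; they differ, so a library built
 * with one layout would read the wrong bytes from a caller's struct
 * built with the other. */
size_t fd_offset_default(void) { return offsetof(struct watcher_default, fd); }
size_t fd_offset_custom(void)  { return offsetof(struct watcher_custom, fd); }
```

When the event loop source is compiled directly into the application, both sides see the same definition and the mismatch cannot arise.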
> * on ev_once failure, you're calling the callback with EV_ERROR? Yuck.
>   That's quite surprising behavior, and I could see it leading to stack
>   overflows as each ev_once tries to issue another one.

All callbacks will be called with EV_ERROR when an error occurs. And yes,
if you don't do error handling and endlessly retry the same operation in a
loop, you run into problems.

But as that is an obvious programming bug, I don't see any problem here.

Besides, if you cannot malloc the few bytes ev_once requires, you need a
*lot* of good error handling code to continue sensibly.

> * What's your use case for ev_loop_new() and ev_loop_default()'s bitmask
>   of allowed implementations?

libevent's unconditional use of getenv raised concerns with me and
apparently some users on this list, too, so this is one way to disable
this (EVMETHOD_ANY instead of EVMETHOD_ALL). Also, I am sure some apps want
control over the allowed event loops, e.g. to rule out select because it is
known not to work for them.

> * (again, just skimming) you're closing fds automatically on ENOMEM?
>   Ergh.

There is little else to do. This isn't malloc or so, but a kernel
interface, and the error is usually due to hard limits (not really out of
memory). Point being, libev(ent) cannot continue in this condition, there
cannot be any progress. Closing the fd and signalling the relevant part of
the application is in no way different than a network overload or problem
resulting in the same condition. If the app cannot handle that, deep shit.

>   That seems rather undesirable for many applications.

Well, it's arguably better than libevent's behaviour, which is simply
returning from event_loop, leaving the app unclear on what has happened and
what to do.

In any case, you can get the same behaviour as libevent by calling unloop
in case of an error, so the interface is strictly more powerful.

Thanks a lot for your questions, I hope I could clarify some things and
design decisions.
It's indeed not easy to get everything right, and I am sure the ev.h API
can get improvements. In some cases there have been design trade-offs (no
mutexes leading to less automatic management for example). I do think the
design is useful in practice, where error handling is rarely done to the
utmost extent and sensible behaviour in dead-end situations counts a lot.

-- 
                The choice of a       Deliantra, the free code+content MORPG
      -----==-     _GNU_              http://www.deliantra.net
      ----==-- _       generation
      ---==---(_)__  __ ____  __      Marc Lehmann
      --==---/ / _ \/ // /\ \/ /      pcg@goof.com
      -=====/_/_//_/\_,_/ /_/\_\

From libev at schmorp.de Sun Nov 4 20:42:16 2007
From: libev at schmorp.de (Marc Lehmann)
Date: Sun Nov 4 20:42:22 2007
Subject: [Libevent-users] announcing libev, towards a faster and more featureful libevent
In-Reply-To: <20071105011936.GB19235@ns1.anodized.com>
References: <20071103201507.GA11441@schmorp.de> <472E6C19.5060502@slamb.org>
 <20071105011936.GB19235@ns1.anodized.com>
Message-ID: <20071105014216.GB763@schmorp.de>

On Sun, Nov 04, 2007 at 05:19:36PM -0800, Christopher Layne wrote:
> Not to put on my O-face, but binary heap insert is *average* O(1). There
> should be a performance win for libevent, when it comes to timer checking,
> as using a heap will also be O(1) for heap_min() - which will benefit pending
> timer calculation code. However, early delete w/ pending timer will need some
> rigging or tombstone games. I wouldn't be surprised that, in a case where one
> is consistently resetting timers (think read/write before x time passes) and
> re-adding said event, that in the end it will be the same amount of CPU time.

No, because a red-black tree is much more complex in management, so even if
both operations are O(log n), the heap usually wins hands down.

Both insertion and removal are of the same complexity, on average, in a
heap, if the data is random.

However, libev has an interface (ev_timer_again) that incrementally
updates the heap.
Also, for repeating timers in general, there is no removal/insertion but
only an incremental update.

Regarding pending events, this is solved in the same way for all events
(not unlike how libevent does it): There is only one place where a pending
event can be, and that is on its associated pending list.

When an event gets stopped, and is found pending, it will be removed from
the pending queue by nulling out its pointer.

The heap insertion/removal is trivial.

(Most of the benchmark differences are, in fact, due to the heap vs.
rb-tree).

From libev at schmorp.de Sun Nov 4 20:46:36 2007
From: libev at schmorp.de (Marc Lehmann)
Date: Sun Nov 4 20:46:42 2007
Subject: [Libevent-users] announcing libev, towards a faster and more featureful libevent
In-Reply-To: <472E5F24.3060206@slamb.org>
References: <20071103201507.GA11441@schmorp.de>
 <20071103224539.GA20579@wilbur.25thandClement.com>
 <20071104015702.GB25876@schmorp.de> <472E5F24.3060206@slamb.org>
Message-ID: <20071105014636.GC763@schmorp.de>

On Sun, Nov 04, 2007 at 04:09:08PM -0800, Scott Lamb wrote:
> Have you seen the new Linux timerfd API? Where available, you can wait
> for CLOCK_MONOTONIC and CLOCK_REALTIME events independently. Beats
> heuristics,

How? I still need to detect time jumps. If my ev_periodic is to be
scheduled every minute, on the minute, and somebody resets the time, the
timer needs to be rescheduled. With timerfd I would need to detect that and
remove/insert the timer again.

(You might have no use for periodics for timeouts, but they are designed
to solve this very problem :)

Besides, having a syscall per timer (or even timer change) would be an
enormous overhead for many workloads.

> and detecting time jumps sound like introducing a lot of extra
> timeouts.
I don't quite see how that would happen with either timer event currently
in libev, unless the user code forces it.

> I'd hate to see libev(ent)? show up on PowerTOP after just getting rid
> of the 5-second timeout.

No idea what powertop is, but sporadic servers might use a lot of cpu
without the kernel ever realising it :)

From clayne at anodized.com Sun Nov 4 21:00:56 2007
From: clayne at anodized.com (Christopher Layne)
Date: Sun Nov 4 21:01:04 2007
Subject: [Libevent-users] announcing libev, towards a faster and more featureful libevent
In-Reply-To: <20071105014216.GB763@schmorp.de>
References: <20071103201507.GA11441@schmorp.de> <472E6C19.5060502@slamb.org>
 <20071105011936.GB19235@ns1.anodized.com>
 <20071105014216.GB763@schmorp.de>
Message-ID: <20071105020056.GC19235@ns1.anodized.com>

On Mon, Nov 05, 2007 at 02:42:16AM +0100, Marc Lehmann wrote:
> On Sun, Nov 04, 2007 at 05:19:36PM -0800, Christopher Layne wrote:
> > Not to put on my O-face, but binary heap insert is *average* O(1). There
> > should be a performance win for libevent, when it comes to timer checking,
> > as using a heap will also be O(1) for heap_min() - which will benefit pending
> > timer calculation code. However, early delete w/ pending timer will need some
> > rigging or tombstone games. I wouldn't be surprised that, in a case where one
> > is consistently resetting timers (think read/write before x time passes) and
> > re-adding said event, that in the end it will be the same amount of CPU time.
>
> No, because a red-black tree is much more complex in management, so even if
> both operations are O(log n), the heap usually wins hands down.
>
> Both insertion and removal are of the same complexity, on average, in a
> heap, if the data is random.
>
> However, libev has an interface (ev_timer_again) that incrementally
> updates the heap. Also, for repeating timers in general, there is no
> removal/insertion but only an incremental update.

Right, which is not due to an inherent advantage of heap vs rbtree - but due
to our luck in time always going in one direction. I believe similar code
was present in libevent as it was. This in itself should be no different.

> Regarding pending events, this is solved in the same way for all events
> (not unlike how libevent does it): There is only one place where a pending
> event can be, and that is on its associated pending list.
>
> When an event gets stopped, and is found pending, it will be removed from
> the pending queue by nulling out its pointer.

My point was that an event_del() on an event which has been called before its
timer has expired *or* an event_add() with a new timer will require reheapifying
at least part of the timer heap. Having an intrinsically sorted collection of
elements and then altering a value within the middle of said collection before
it has sifted to the top will require a reheap from that point on.

Which isn't really that big a deal as similar time is spent in the present RB
implementation as it is.

I'm all for a binary-heap rather than a RB-tree personally. I think the
performance will benefit primarily for heap_min() (which is done before every
re-entry into the event backend to reset the max-wait timer for epoll/poll/select,
etc).
-cl

From libev at schmorp.de Sun Nov 4 21:13:12 2007
From: libev at schmorp.de (Marc Lehmann)
Date: Sun Nov 4 21:13:41 2007
Subject: [Libevent-users] announcing libev, towards a faster and more featureful libevent
In-Reply-To: <20071105013627.GA763@schmorp.de>
References: <20071103201507.GA11441@schmorp.de> <472E6C19.5060502@slamb.org>
 <20071105013627.GA763@schmorp.de>
Message-ID: <20071105021312.GF763@schmorp.de>

On Mon, Nov 05, 2007 at 02:36:27AM +0100, Marc Lehmann wrote:
> > > * multiple watchers can wait for the same event, there is no limitation
> > >   to one or two watchers for signals and io.
> >
> > Could you give me an example of where that is important?
>
> There is actually a lot of code that relies on this just working, and the
> only other event loop I know that has a problem with this is Tk.

I forgot to mention that the resulting code is likely a tiny bit faster, and
certainly way less complex, than the multiple-case logic employed in some
libevent backends:

   /* detecting whether evread or evwrite are wanted is not shown */
   if (evread != NULL && !(evread->ev_events & EV_PERSIST))
     event_del(evread);
   if (evwrite != NULL && evwrite != evread &&
       !(evwrite->ev_events & EV_PERSIST))
     event_del(evwrite);

This juggling of two slots for read/write didn't feel right. The code to
check which watchers want which events and schedule them in ev basically
looks like this:

   for (w = anfd->head; w; w = ((WL)w)->next)
     if ((ev = w->events & events))
       event (EV_A_ (W)w, ev);

Also, some backends do reference counting in libevent, some don't, and I
don't like completely different semantics unless there is a good technical
reason for them (epoll cannot easily detect closed fds, for example, a big
problem, but at least it's something that cannot easily be improved upon).

The goal obviously wasn't to make this ultra-efficient (it's a
singly-linked list), but to make it possible on a small scale without
mysteriously failing on some backends.
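
The per-fd watcher list described in this message can be sketched in a
self-contained form. This is only an illustration under stated
assumptions: the sk_* names, the SK_READ/SK_WRITE bits and the direct
callback invocation are all hypothetical and are not libev's or
libevent's definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event bits; not libev's or libevent's values. */
enum { SK_READ = 1, SK_WRITE = 2 };

typedef struct sk_watcher {
    struct sk_watcher *next;   /* next watcher on the same fd */
    int events;                /* events this watcher wants */
    void (*cb)(struct sk_watcher *w, int revents);
    int fired;                 /* last events delivered, recorded for the demo */
} sk_watcher;

typedef struct {
    sk_watcher *head;          /* singly-linked list of watchers for one fd */
} sk_fd;

/* Push a watcher on the front of the fd's list: O(1). */
void sk_fd_add(sk_fd *fd, sk_watcher *w)
{
    w->next = fd->head;
    fd->head = w;
}

/* Unlink a watcher: O(n) in the number of watchers on this fd. */
void sk_fd_del(sk_fd *fd, sk_watcher *w)
{
    sk_watcher **pp;
    for (pp = &fd->head; *pp; pp = &(*pp)->next)
        if (*pp == w) {
            *pp = w->next;
            w->next = NULL;
            return;
        }
}

/* Union of all wanted events: what a backend would be asked to poll for. */
int sk_fd_wanted(const sk_fd *fd)
{
    int events = 0;
    const sk_watcher *w;
    for (w = fd->head; w; w = w->next)
        events |= w->events;
    return events;
}

/* Deliver reported events to every interested watcher.
   Returns the number of callbacks invoked. */
int sk_fd_dispatch(sk_fd *fd, int revents)
{
    int n = 0;
    sk_watcher *w, *next;
    for (w = fd->head; w; w = next) {
        int ev = w->events & revents;
        next = w->next;        /* read before the callback runs */
        if (ev) {
            w->cb(w, ev);
            ++n;
        }
    }
    return n;
}

/* Demo callback: remember which events were delivered. */
void sk_record(sk_watcher *w, int revents) { w->fired = revents; }
```

With two watchers on one fd, one wanting SK_READ and one wanting both
bits, sk_fd_wanted() yields both bits and a reported SK_READ reaches both
callbacks, with no special-casing of a "read slot" and a "write slot".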
From libev at schmorp.de Sun Nov 4 21:20:14 2007
From: libev at schmorp.de (Marc Lehmann)
Date: Sun Nov 4 21:20:16 2007
Subject: [Libevent-users] announcing libev, towards a faster and more featureful libevent
In-Reply-To: <20071105014216.GB763@schmorp.de>
References: <20071103201507.GA11441@schmorp.de> <472E6C19.5060502@slamb.org>
 <20071105011936.GB19235@ns1.anodized.com>
 <20071105014216.GB763@schmorp.de>
Message-ID: <20071105022014.GG763@schmorp.de>

On Mon, Nov 05, 2007 at 02:42:16AM +0100, Marc Lehmann wrote:
> However, libev has an interface (ev_timer_again) that incrementally
> updates the heap. Also, for repeating timers in general, there is no
> removal/insertion but only an incremental update.

Oh, and sorry for always forgetting stuff... the rationale for supporting
this operation is that I think it's pretty important to support timers that
get reset all the time without usually firing (e.g. idle read timeouts on a
normally busy tcp connection).

The other rationale is that it's trivial to implement with a heap, because
you already have all the code to do it:

   /* incremental timer update in ev_timer_again: */
   w->at = mn_now + w->repeat;
   downheap (timers, timercnt, w->active - 1);

   /* compare with timer removal: */
   timers [w->active - 1] = timers [--timercnt];
   downheap (timers, timercnt, w->active - 1);

In such a case (updating a timer) the event will simply wander down from
its current place in the heap to its new place.

I am not sure whether this can be done with an rb-tree (likely), but I am
sure that I do not want to have to maintain the code that does that :)

(In any case, see the timer benchmark for a good comparison of heap vs.
red-black-tree).
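
A self-contained sketch of an indexed binary min-heap with an in-place
re-arm, in the spirit of the excerpts in this message. All sk_* names and
the fixed capacity are assumptions made for illustration; this is not
libev's code. Unlike the two-line removal excerpt, the sketch sifts in
both directions on removal, which a general-purpose heap needs because the
element moved into the hole may belong above it.

```c
#include <assert.h>
#include <stddef.h>

#define SK_MAX_TIMERS 64         /* fixed capacity keeps the sketch short */

typedef struct {
    double at;                   /* absolute expiry time */
    double repeat;               /* re-arm interval, e.g. an idle timeout */
    int active;                  /* heap index + 1, or 0 when not in the heap */
} sk_timer;

typedef struct {
    sk_timer *slot[SK_MAX_TIMERS];
    int cnt;
} sk_heap;

/* Store w at index k and keep its back-reference current. */
void sk_place(sk_heap *h, int k, sk_timer *w)
{
    h->slot[k] = w;
    w->active = k + 1;
}

/* Move the element at k towards the root while it expires before its parent. */
void sk_upheap(sk_heap *h, int k)
{
    sk_timer *w = h->slot[k];
    while (k > 0) {
        int p = (k - 1) / 2;
        if (h->slot[p]->at <= w->at)
            break;
        sk_place(h, k, h->slot[p]);
        k = p;
    }
    sk_place(h, k, w);
}

/* Move the element at k towards the leaves while a child expires before it. */
void sk_downheap(sk_heap *h, int k)
{
    sk_timer *w = h->slot[k];
    for (;;) {
        int c = 2 * k + 1;
        if (c >= h->cnt)
            break;
        if (c + 1 < h->cnt && h->slot[c + 1]->at < h->slot[c]->at)
            ++c;                 /* pick the earlier of the two children */
        if (w->at <= h->slot[c]->at)
            break;
        sk_place(h, k, h->slot[c]);
        k = c;
    }
    sk_place(h, k, w);
}

/* Insert: append at the end, then sift up. Returns 0, or -1 when full. */
int sk_timer_start(sk_heap *h, sk_timer *w, double now)
{
    if (h->cnt == SK_MAX_TIMERS)
        return -1;
    w->at = now + w->repeat;
    sk_place(h, h->cnt++, w);
    sk_upheap(h, w->active - 1);
    return 0;
}

/* Remove: move the last element into the hole and restore heap order. */
void sk_timer_stop(sk_heap *h, sk_timer *w)
{
    int k = w->active - 1;
    if (k < 0)
        return;                  /* not in the heap */
    w->active = 0;
    if (k < --h->cnt) {
        sk_timer *moved = h->slot[h->cnt];
        sk_place(h, k, moved);
        sk_upheap(h, k);
        sk_downheap(h, moved->active - 1);
    }
}

/* Re-arm in place: the expiry only moves later, so sifting down suffices. */
void sk_timer_again(sk_heap *h, sk_timer *w, double now)
{
    assert(w->active > 0);
    assert(now + w->repeat >= w->at);
    w->at = now + w->repeat;
    sk_downheap(h, w->active - 1);
}

/* Earliest timer, or NULL when the heap is empty: O(1). */
sk_timer *sk_timer_next(const sk_heap *h)
{
    return h->cnt ? h->slot[0] : NULL;
}
```

The stored index (active) is what makes both removal and re-arming start
at the timer's own position instead of searching for it. A server would
call sk_timer_again() on each read, so a busy connection only pays for a
short sift instead of a remove plus insert.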
From libev at schmorp.de Sun Nov 4 21:29:34 2007
From: libev at schmorp.de (Marc Lehmann)
Date: Sun Nov 4 21:29:37 2007
Subject: [Libevent-users] announcing libev, towards a faster and more featureful libevent
In-Reply-To: <20071105020056.GC19235@ns1.anodized.com>
References: <20071103201507.GA11441@schmorp.de> <472E6C19.5060502@slamb.org>
 <20071105011936.GB19235@ns1.anodized.com>
 <20071105014216.GB763@schmorp.de>
 <20071105020056.GC19235@ns1.anodized.com>
Message-ID: <20071105022934.GH763@schmorp.de>

On Sun, Nov 04, 2007 at 06:00:56PM -0800, Christopher Layne wrote:
> My point was that an event_del() on an event which has been called before its
> timer has expired *or* an event_add() with a new timer will require reheapifying
> at least part of the timer heap.

Hmm, I do not see why this would ever be the case. Removing a timer that
hasn't expired yet might actually be much cheaper than removing the top
element, because it isn't at the root, so the n in the O(log n) is
potentially much smaller.

> Having an intrinsically sorted collection of elements and then altering
> a value within the middle of said collection before it has sifted to the
> top will require a reheap from that point on.

Not sure what you mean with "re-heap", but the operation you describe is
the same O(log n) operation as for removing elements elsewhere in the heap.
And given that you take the latest timer and insert it at that point,
removing a timer that hasn't expired is usually faster.

> Which isn't really that big a deal as similar time is spent in the present RB
> implementation as it is.

No, I still maintain that the RB tree is slower because its rebalancing
operations are frequent and very complex. Heap code is trivial.
Yes, they have the same asymptotic growth behaviour, but the practical
cases are all very far away from infinity, and the hidden C in O(log n) is
quite important.

(Again, please refer to the benchmark at http://libev.schmorp.de/bench.html
which directly contrasts the behaviour of libevent and libev with timers
and no timers; look especially at the difference in runtime when timers are
being used).

> I'm all for a binary-heap rather than a RB-tree personally. I think the
> performance will benefit primarily for heap_min() (which is done before every
> re-entry into the event backend to reset the max-wait timer for epoll/poll/select,
> etc).

I thought so, too, until recently, but in fact the event loop is run pretty
rarely (except in benchmarks), and if you handle hundreds of clients within
each run (very typical of busy servers), then you can have hundreds of
timer updates, and these do show up in timing measurements.

From clayne at anodized.com Sun Nov 4 21:30:10 2007
From: clayne at anodized.com (Christopher Layne)
Date: Sun Nov 4 21:30:14 2007
Subject: [Libevent-users] announcing libev, towards a faster and more featureful libevent
In-Reply-To: <20071105014636.GC763@schmorp.de>
References: <20071103201507.GA11441@schmorp.de>
 <20071103224539.GA20579@wilbur.25thandClement.com>
 <20071104015702.GB25876@schmorp.de> <472E5F24.3060206@slamb.org>
 <20071105014636.GC763@schmorp.de>
Message-ID: <20071105023010.GD19235@ns1.anodized.com>

On Mon, Nov 05, 2007 at 02:46:36AM +0100, Marc Lehmann wrote:
> On Sun, Nov 04, 2007 at 04:09:08PM -0800, Scott Lamb wrote:
> > Have you seen the new Linux timerfd API? Where available, you can wait
> > for CLOCK_MONOTONIC and CLOCK_REALTIME events independently. Beats
> > heuristics,
>
> How?
> I still need to detect time jumps. If my ev_periodic is to be scheduled
> every minute, on the minute, and somebody resets the time, the timer needs to
> be rescheduled. With timerfd I would need to detect that and remove/insert
> the timer again.
>
> (You might have no use for periodics for timeouts, but they are designed
> to solve this very problem :)

timerfd() has good and redundant points... as far as I can tell, it's an
inversion of user<>kernel code that results in the same goal.

http://lwn.net/Articles/245688/

-cl

From william at 25thandClement.com Sun Nov 4 22:56:36 2007
From: william at 25thandClement.com (William Ahern)
Date: Sun Nov 4 22:56:39 2007
Subject: [Libevent-users] announcing libev, towards a faster and more featureful libevent
In-Reply-To: <20071105022934.GH763@schmorp.de>
References: <20071103201507.GA11441@schmorp.de> <472E6C19.5060502@slamb.org>
 <20071105011936.GB19235@ns1.anodized.com>
 <20071105014216.GB763@schmorp.de>
 <20071105020056.GC19235@ns1.anodized.com>
 <20071105022934.GH763@schmorp.de>
Message-ID: <20071105035636.GA27554@wilbur.25thandClement.com>

On Mon, Nov 05, 2007 at 03:29:34AM +0100, Marc Lehmann wrote:
> On Sun, Nov 04, 2007 at 06:00:56PM -0800, Christopher Layne wrote:

> > Which isn't really that big a deal as similar time is spent in the present RB
> > implementation as it is.
>
> No, I still maintain that the RB tree is slower because its rebalancing
> operations are frequent and very complex. Heap code is trivial. Yes, they
> have the same asymptotic growth behaviour, but the practical cases are
> all very far away from infinity, and the hidden C in O(log n) is quite
> important.
>

RB balancing isn't that complex. Maybe you're thinking of AVL trees?

The problem with using heaps in network software is you must be careful
adversaries cannot dictate any of the parameters. Certainly when you're
dealing with timers triggered by I/O latencies you've at least theoretically
exposed yourself to complexity attacks.
(This is a guess based on intuition; I haven't yet looked at your code.)

Nice thing about RB trees is you get a cap on worst case performance, smooth
cost spreading (i.e. no rehashing; not that it would be needed here), and
don't have to worry about pathological or malicious scenarios. Sure you pay
a small price. But for peace of mind it's well worth it.

From william at 25thandClement.com Sun Nov 4 22:59:03 2007
From: william at 25thandClement.com (William Ahern)
Date: Sun Nov 4 22:59:05 2007
Subject: [Libevent-users] announcing libev, towards a faster and more featureful libevent
In-Reply-To: <20071105035636.GA27554@wilbur.25thandClement.com>
References: <20071103201507.GA11441@schmorp.de> <472E6C19.5060502@slamb.org>
 <20071105011936.GB19235@ns1.anodized.com>
 <20071105014216.GB763@schmorp.de>
 <20071105020056.GC19235@ns1.anodized.com>
 <20071105022934.GH763@schmorp.de>
 <20071105035636.GA27554@wilbur.25thandClement.com>
Message-ID: <20071105035902.GB27554@wilbur.25thandClement.com>

On Sun, Nov 04, 2007 at 07:56:36PM -0800, William Ahern wrote:
> On Mon, Nov 05, 2007 at 03:29:34AM +0100, Marc Lehmann wrote:
> > On Sun, Nov 04, 2007 at 06:00:56PM -0800, Christopher Layne wrote:
>
> > > Which isn't really that big a deal as similar time is spent in the present RB
> > > implementation as it is.
> >
> > No, I still maintain that the RB tree is slower because its rebalancing
> > operations are frequent and very complex. Heap code is trivial. Yes, they
> > have the same asymptotic growth behaviour, but the practical cases are
> > all very far away from infinity, and the hidden C in O(log n) is quite
> > important.
> >
>
> RB balancing isn't that complex. Maybe you're thinking of AVL trees?
>
> The problem with using heaps in network software is you must be careful
> adversaries cannot dictate any of the parameters. Certainly when you're

Ignore this. I'm confusing heaps with hashes....
From slamb at slamb.org Mon Nov 5 00:37:52 2007
From: slamb at slamb.org (Scott Lamb)
Date: Mon Nov 5 00:38:00 2007
Subject: [Libevent-users] announcing libev, towards a faster and more featureful libevent
In-Reply-To: <20071105013627.GA763@schmorp.de>
References: <20071103201507.GA11441@schmorp.de> <472E6C19.5060502@slamb.org>
 <20071105013627.GA763@schmorp.de>
Message-ID: <472EAC30.8060509@slamb.org>

Marc Lehmann wrote:
> On Sun, Nov 04, 2007 at 05:04:25PM -0800, Scott Lamb wrote:
>>> * multiple watchers can wait for the same event, there is no limitation
>>>   to one or two watchers for signals and io.
>> Could you give me an example of where that is important?
>
> Mostly in environments using some form of garbage collection. For example,
> this idiom is common in Perl:
>
>    $event = EV::io $fd, ...
>
> If $event happens to contain an old watcher and $fd happens to refer to
> the same fd as that old watcher, this will lead to problems as the
> watchers are both alive for a short time.
>
> There is actually a lot of code that relies on this just working, and the
> only other event loop I know that has a problem with this is Tk.

Ugh, I'd argue that idiom is broken. But if the support's free, I guess
it doesn't matter.

>>> * there are two types of timers, based on real time differences and wall
>>>   clock time (cron-like). timers can also be repeating and be reset at
>>>   almost no cost (for idle timeouts used by many network servers). time jumps
>>>   get detected reliably in both directions with or without a monotonic clock.
>> (See my other mail about Linux's new timerfd facility.)
>
> (timerfd unfortunately makes little sense for this, as it adds overhead
> but I can't see the compelling advantage, as one will still run into the
> same time jump problems with periodic timers).
>
>> Nice; repeating and absolute timers have come up several times before, too.
>
> This was something I always missed in event loops.
> That is, some event loops have one timer type, some the other, but never
> both.
>
>>> * timers are managed by a priority queue (O(1) for important operations
>>>   as opposed to O(log n) in libevent, also resulting in much simpler code).
>> In terms of concrete data types, you appear to have used a binary heap?
>> So by "important operations" you mean removal, correct?
>
> removal: O(log n)
> insertion: O(log n)
> find next: O(1)
>
>> still O(log n)? The asymptotic behavior is no different, then, as
>> insertion happens at least as often as removal.
>
> Yes, but:
>
> a) finding the next timer is a constant-time operation
> b) a red-black tree is more than three times as slow
>
> (see the updated benchmark at http://libev.schmorp.de/bench.html,
> especially the difference between the first (no timers) and the second
> examples (timers in use))

Ahh, very nice benchmarks.

>
>>> * I added idle watchers, pid watchers and hook watchers into the event loop,
>>>   as is required for integration of other event-based libraries, without
>>>   having to force the use of some construct around event_loop.
>> Pardon my ignorance, but what are hook watchers?
>
> If you want to plug other event-based libraries into the event loop, you
> need to be able to hook into the event loop. This is what those watcher
> types provide.
>
> The alternative would be to write your own event_loop with EV_NONBLOCK, but
> that isn't modular: if you have two different software modules, each with
> its own event_loop that you *must* use, you lose. Prepare/check watchers
> solve this problem nicely.
>
> A number of event loops have them, and they are useful for other things,
> such as transparently integrating coroutine packages etc.
>
> It's not a killer feature, just very, very useful in some cases.
>
>> pid watchers I assume to be a fancy SIGCHLD handler?
>
> Yes.
>
>> That's a potentially useful feature, but why would it require a
>> construct around event_loop?
>
> I don't understand that, there is no construct around event_loop, it's
> handled completely separately.

My question was in response to your "I added idle watchers, pid watchers
and hook watchers into the event loop, as is required for integration of
other event-based libraries, without having to force the use of some
construct around event_loop."

> The reason it exists is to allow sharing this potentially unsharable
> resource. For example, poll and select let you do "everything" (with fds),
> but you can of course only have one component per (single-thread) process
> using them, as they are blocking.
>
> The same thing is true for signals: you can't share them with sigaction, as
> sigaction only allows one user.
>
> And the same thing is true for sigchld.

Yes, I could see why sharing SIGCHLD would be useful. I was thinking of
this when asking above when you want to have multiple watchers for the
same event, but this was the only example I could think of off-hand, so
it seemed like two features to address the same use case.

>> A few notes:
>>
>> * what is the purpose of EV_COMMON?
>
> Allowing customised event watchers. If you are concerned, treat it as an
> internal symbol. Its use is documented here:
> http://cvs.schmorp.de/libev/README.embed
>
>>   From first glance, I'm concerned that it could not be used properly
>>   unless libev.so and all callers are compiled with the same flags, which
>>   seems impractical if the library ever gains wide use.
>
> This is true, but it's an optional feature you don't have to use. In case
> you wonder, EV, the perl interface to libev, uses this feature.
>
> It makes most sense when embedding, of course (not all the world is an .so
> :).

Hmm, in your Perl example, I wouldn't rule out you wanting to share the
event loop with some C-based library and being unable to do so.

All the world is an .so. ;)

There's definitely nothing you can't do with a void*, so this is all a
question of efficiency.
I assert that the cost of a sizeof(void*) to point to the relevant part of
your structure (which can be nearby...still reasonable cache locality) is
not too high.

>
>> * on ev_once failure, you're calling the callback with EV_ERROR? Yuck.
>>   That's quite surprising behavior, and I could see it leading to stack
>>   overflows as each ev_once tries to issue another one.
>
> All callbacks will be called with EV_ERROR when an error occurs. And yes,
> if you don't do error handling and endlessly retry the same operation in a
> loop, you run into problems.
>
> But as that is an obvious programming bug, I don't see any problem here.

Hmm. Let me introduce a use case: an event-driven program which must not
fail. init or similar.

I worked on such a program recently. If it were unreliable, you would
have to send the system back to the factory for repair (i.e., flashing
new software). On ENOMEM, it would basically sleep and retry. This was
quite successful (memory could be temporarily consumed by network
buffers, etc, which cleared itself up after a while).

For this program, it's important to know more than that an error has
occurred. EV_ERROR is totally inadequate. You're using it for several
different cases. I spotted at least these three:

* malloc() failed in ev_once - transient runtime error.
* select() failed with ENOMEM, so libev chose to kill this file
  descriptor and now is notifying userspace.
* bad file descript