In this post I'll show you the code path that Rust takes inside its standard library when you open a file. I wanted to learn how Rust handles system calls and errno, and all the little subtleties of the POSIX API. This is what I learned!

When you open a file, or create a socket, or do anything else that returns an object that can be accessed like a file, you get a file descriptor in the form of an int.

You get a nonnegative integer in case of success, or -1 in case of an error. If there's an error, you look at errno, which gives you an integer error code.

Many system calls can return EINTR, which means 'interrupted system call': something interrupted the kernel while it was doing your system call, and it returned control to userspace with the syscall unfinished. For example, your process may have received a Unix signal (e.g. you suspend it with Ctrl-Z on a terminal, which sends SIGTSTP, or you resize the terminal and the process gets SIGWINCH). Most of the time EINTR simply means that you must retry the operation: if you Ctrl-Z a program to suspend it and then fg to continue it, and the program was in the middle of open()ing a file, you would expect it to continue at that exact point and actually open the file. Software that doesn't check for EINTR can fail in very subtle ways!

Once you have an open file descriptor, you can read from it:
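
As a sketch (using the libc crate directly from Rust; the function and its parameters are just for illustration), a raw read looks like this:

```rust
extern crate libc;

use std::io;

// A raw read(2) on a file descriptor, following the POSIX conventions
// described below.
fn read_fd(fd: libc::c_int, buf: &mut [u8]) -> io::Result<usize> {
    let n = unsafe { libc::read(fd, buf.as_mut_ptr() as *mut libc::c_void, buf.len()) };
    if n < 0 {
        Err(io::Error::last_os_error()) // consults errno for us
    } else {
        Ok(n as usize) // 0 means end-of-file
    }
}
```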

... and one has to remember that if read() returns 0, it means we were at end-of-file; if it returns fewer bytes than requested, that is not an error (for a regular file it usually means we were close to the end of the file); and if this is a nonblocking socket and it returns EWOULDBLOCK or EAGAIN, one must decide whether to retry the operation right away or to wait and try again later.

There is a lot of buggy software written in C that tries to use the POSIX API directly, and gets these subtleties wrong. Most programs written in high-level languages use the I/O facilities provided by their language, which hopefully make things easier.

Rust makes error handling convenient and safe. If you decide to ignore an error, the code looks like it is ignoring the error (e.g. you can grep for unwrap() and find lazy code). The code actually looks better if it doesn't ignore the error and properly propagates it upstream (e.g. you can use the ? shortcut to propagate errors to the calling function).
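
For example (config.toml is a hypothetical path), compare the lazy and the careful versions:

```rust
use std::fs::File;
use std::io;

// Lazy: panics on error, and the unwrap() is easy to grep for.
fn load_config_lazy() -> File {
    File::open("config.toml").unwrap()
}

// Careful: the ? operator propagates the io::Error to the caller.
fn load_config() -> io::Result<File> {
    let f = File::open("config.toml")?;
    Ok(f)
}
```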

I keep recommending this article on error models to people; it discusses POSIX-like error codes vs. exceptions vs. more modern approaches like Haskell's and Rust's - definitely worth studying over a few days (also, see Miguel's valiant effort to move C# I/O away from exceptions for I/O errors).

So, what happens when one opens a file in Rust, from the top-level API down to the system calls? Let's go down the rabbit hole.

You can open a file like this:
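
Something like this ("hello.txt" is just an example path):

```rust
use std::fs::File;

fn main() {
    match File::open("hello.txt") {
        Ok(file) => println!("opened: {:?}", file),
        Err(e) => eprintln!("could not open hello.txt: {}", e),
    }
}
```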

This does not give you a raw file descriptor; it gives you an io::Result<fs::File> - that is, a Result<fs::File, io::Error> - which you must pick apart to see whether you actually got back a File that you can operate on, or an error.

Let's look at the implementation of File::open() and File::create().
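
Lightly abridged from the standard library (the exact code varies between Rust versions), they look roughly like this:

```rust
impl File {
    pub fn open<P: AsRef<Path>>(path: P) -> io::Result<File> {
        OpenOptions::new().read(true).open(path.as_ref())
    }

    pub fn create<P: AsRef<Path>>(path: P) -> io::Result<File> {
        OpenOptions::new().write(true).create(true).truncate(true).open(path.as_ref())
    }
}
```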

Here, OpenOptions is an auxiliary struct that implements a 'builder' pattern. Instead of passing bit flags for the various O_CREAT/O_APPEND/etc. flags from the open(2) system call, one builds a struct with the desired options, and finally calls .open() on it.
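
For example, to append to a log file and create it if it doesn't exist (the filename is hypothetical):

```rust
use std::fs::{File, OpenOptions};
use std::io;

fn open_log() -> io::Result<File> {
    // In spirit: open("app.log", O_WRONLY | O_APPEND | O_CREAT, 0o666)
    OpenOptions::new().append(true).create(true).open("app.log")
}
```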

So, let's look at the implementation of OpenOptions::open():
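
Roughly (again abridged; the private _open() indirection may differ between versions):

```rust
impl OpenOptions {
    pub fn open<P: AsRef<Path>>(&self, path: P) -> io::Result<File> {
        self._open(path.as_ref())
    }

    fn _open(&self, path: &Path) -> io::Result<File> {
        let inner = fs_imp::File::open(path, &self.0)?;
        Ok(File { inner: inner })
    }
}
```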

See that fs_imp::File::open()? That's what we want: it's the platform-specific wrapper for opening files. Let's look at its implementation for Unix:
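
In libstd's sys/unix/fs.rs it looks roughly like this:

```rust
impl File {
    pub fn open(path: &Path, opts: &OpenOptions) -> io::Result<File> {
        let path = cstr(path)?;
        File::open_c(&path, opts)
    }
}
```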

The first line, let path = cstr(path)?, tries to convert a Path into a nul-terminated C string. The second line calls the following:
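
Abridged to the interesting part (details vary between versions):

```rust
impl File {
    pub fn open_c(path: &CStr, opts: &OpenOptions) -> io::Result<File> {
        let flags = libc::O_CLOEXEC |
                    opts.get_access_mode()? |
                    opts.get_creation_mode()? |
                    (opts.custom_flags as c_int & !libc::O_ACCMODE);
        let fd = cvt_r(|| unsafe {
            open64(path.as_ptr(), flags, opts.mode as c_int)
        })?;
        let fd = FileDesc::new(fd);
        // (additional fd setup elided in this sketch)
        Ok(File(fd))
    }
}
```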

Here, let flags = ... converts the OpenOptions we had in the beginning to an int with bit flags.

Then, it does let fd = cvt_r(LAMBDA), and that lambda function calls the actual open64() from libc (a Rust wrapper for the system's libc): it returns a file descriptor, or -1 on error. Why is this done in a lambda? Let's look at cvt_r():
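
It lives in libstd's sys/unix/mod.rs and looks roughly like this:

```rust
pub fn cvt_r<T, F>(mut f: F) -> io::Result<T>
    where T: IsMinusOne,
          F: FnMut() -> T
{
    loop {
        match cvt(f()) {
            // EINTR: the call was interrupted; just try again.
            Err(ref e) if e.kind() == ErrorKind::Interrupted => {}
            other => return other,
        }
    }
}
```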

Okay! Here f is the lambda that calls open64(); cvt_r() calls it in a loop and translates the POSIX-like result into something friendly to Rust. This loop is where it handles EINTR, which gets translated into ErrorKind::Interrupted. I suppose cvt_r() stands for convert_retry()? Let's look at the implementation of cvt(), which fetches the error code:
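
Roughly:

```rust
pub fn cvt<T: IsMinusOne>(t: T) -> io::Result<T> {
    if t.is_minus_one() {
        Err(io::Error::last_os_error())
    } else {
        Ok(t)
    }
}
```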

(The IsMinusOne shenanigans are just a Rust-ism to help convert multiple integer types without a lot of as casts.)

The above means: if the POSIX-like result was -1, return an Err() from the last error returned by the operating system. That should surely be errno internally, correct? Let's look at the implementation of io::Error::last_os_error():
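
Roughly:

```rust
impl Error {
    pub fn last_os_error() -> Error {
        Error::from_raw_os_error(sys::os::errno() as i32)
    }
}
```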

We don't need to look at Error::from_raw_os_error(); it's just a conversion function from an errno value into a Rust enum value. However, let's look at sys::os::errno():
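
Abridged; the real function has cfg_attr cases for every supported platform, while this sketch shows only the Linux and macOS/iOS link names:

```rust
pub fn errno() -> i32 {
    extern {
        #[cfg_attr(target_os = "linux", link_name = "__errno_location")]
        #[cfg_attr(any(target_os = "macos", target_os = "ios"), link_name = "__error")]
        fn errno_location() -> *mut c_int;
    }

    unsafe { (*errno_location()) as i32 }
}
```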

Here, errno_location() is an extern function defined in GNU libc (or whatever C library your Unix uses). It returns a pointer to the actual int which is the errno thread-local variable. Since non-C code can't use libc's global variables directly, there needs to be a way to get their addresses via function calls - that's what errno_location() is for.

And on Windows?

Remember the internal File::open()? This is what it looks like on Windows:
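
Roughly (abridged from libstd's sys/windows/fs.rs; the exact fields and helpers differ between versions):

```rust
impl File {
    pub fn open(path: &Path, opts: &OpenOptions) -> io::Result<File> {
        let path = to_u16s(path)?;
        let handle = unsafe {
            c::CreateFileW(path.as_ptr(),
                           opts.get_access_mode()?,
                           opts.share_mode,
                           opts.security_attributes as *mut _,
                           opts.get_creation_mode()?,
                           opts.get_flags_and_attributes(),
                           ptr::null_mut())
        };
        if handle == c::INVALID_HANDLE_VALUE {
            Err(Error::last_os_error())
        } else {
            Ok(File { handle: Handle::new(handle) })
        }
    }
}
```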

CreateFileW() is the Windows API function to open files. The conversion of error codes inside Error::last_os_error() happens analogously - it calls GetLastError() from the Windows API and converts it.

Can we not call C libraries?

The Rust/Unix code above depends on the system's libc for open() and errno, which are entirely C constructs. Libc is what actually does the system calls. There are efforts to make the Rust standard library not use libc and use syscalls directly.

As an example, you can look at the Rust standard library for Redox. Redox is a new operating system kernel entirely written in Rust. Fun times!

Update: If you want to see what a C-less libstd would look like, take a look at steed, an effort to reimplement Rust's libstd without C dependencies.

Rust is very meticulous about error handling, but it succeeds in making it pleasant to read. I/O functions give you back an io::Result<>, which you piece apart to see if it succeeded or got an error.

Internally, and for each platform it supports, the Rust standard library translates errno from libc into an io::ErrorKind Rust enum. The standard library also automatically handles Unix-isms like retrying operations on EINTR.

I've been enjoying reading the Rust standard library code: it has taught me many Rust-isms, and it's nice to see how the hairy/historical libc constructs are translated into clean Rust idioms. I hope this little trip down the rabbit hole for the open(2) system call lets you look in other interesting places, too.

Tokio internals: Understanding Rust's asynchronous I/O framework from the bottom up

Tokio is a Rust framework for developing applications which perform asynchronous I/O — an event-driven approach that can often achieve better scalability, performance, and resource usage than conventional synchronous I/O. Unfortunately, Tokio is notoriously difficult to learn due to its sophisticated abstractions. Even after reading the tutorials, I didn't feel that I had internalized the abstractions sufficiently to be able to reason about what was actually happening.

My prior experience with asynchronous I/O programming may have even hindered my Tokio education. I'm accustomed to using the operating system's selection facility (e.g. Linux epoll) as a starting point, and then moving on to dispatch, state machines, and so forth. Starting with the Tokio abstractions with no clear insight into where and how the underlying epoll_wait() happens, I found it difficult to connect all the dots. Tokio and its future-driven approach felt like something of a black box.

Instead of continuing on a top-down approach to learning Tokio, I decided to instead take a bottom-up approach by studying the source code to understand exactly how the current concrete implementation drives the progression from epoll events to I/O consumption within a Future::poll(). I won't go into great detail about the high-level usage of Tokio and futures, as that is better covered in the existing tutorials. I'm also not going to discuss the general problem of asynchronous I/O beyond a short summary, since entire books could be written on the subject. My goal is simply to have some confidence that futures and Tokio's polling work the way I expect.

First, some important disclaimers. Note that Tokio is actively being developed, so some of the observations here may quickly become out-of-date. For the purposes of this study, I used tokio-core 0.1.10, futures 0.1.17, and mio 0.6.10. Since I wanted to understand Tokio at its lowest levels, I did not consider higher-level crates like tokio-proto and tokio-service. The tokio-core event system itself has a lot of moving pieces, most of which I avoid discussing in the interest of brevity. I studied Tokio on a Linux system, and some of the discussion necessarily touches on platform-dependent implementation details such as epoll. Finally, everything mentioned here is my interpretation as a newcomer to Tokio, so there could be errors or misunderstandings.

Asynchronous I/O in a nutshell

Synchronous I/O programming involves performing I/O operations which block until completion. Reads will block until data arrives, and writes will block until the outgoing bytes can be delivered to the kernel. This fits nicely with conventional imperative programming, where a series of steps are executed one after the other. For example, consider an HTTP server that spawns a new thread for each connection. On this thread, it may read bytes until an entire request is received (blocking as needed until all bytes arrive), process the request, and then write the response (blocking as needed until all bytes are written). This is a very straightforward approach. The downside is that a distinct thread is needed for each connection due to the blocking, each with its own stack. In many cases this is fine, and synchronous I/O is the correct approach. However, the thread overhead hinders scalability on servers trying to handle a very large number of connections (see: the C10k problem), and may also be excessive on low-resource systems handling a few connections.

If our HTTP server was written to use asynchronous I/O, on the other hand, it might perform all I/O processing on a single thread. All active connections and the listening socket would be configured as non-blocking, monitored for read/write readiness in an event loop, and execution would be dispatched to handlers as events occur. State and buffers would need to be maintained for each connection. If a handler is only able to read 100 bytes of a 200-byte request, it cannot wait for the remaining bytes to arrive, since doing so would prevent other connections from making progress. It must instead store the partial read in a buffer, keep the state set to 'reading request', and return to the event loop. The next time the handler is called for this connection, it may read the remainder of the request and transition to a 'writing response' state. Implementing such a system can become hairy very fast, with complex state machines and error-prone resource management.

The ideal asynchronous I/O framework would provide a means of writing such I/O processing steps one after the other, as if they were blocking, while behind the scenes generating the event loop and state machines. That's a tough goal in most languages, but Tokio brings us pretty close.

The Tokio stack

The Tokio stack consists of the following components:

  1. The system selector. Each operating system provides a facility for receiving I/O events, such as epoll (Linux), kqueue() (FreeBSD/Mac OS), or IOCP (Windows).
  2. Mio - Metal I/O. Mio is a Rust crate that provides a common API for low-level I/O by internally handling the specific details for each operating system. Mio deals with the specifics of each operating system's selector so you don't have to.
  3. Futures. Futures provide a powerful abstraction for representing things that have yet to happen. These representations can be combined in useful ways to create composite futures describing a complex sequence of events. This abstraction is general enough to be used for many things besides I/O, but in Tokio we develop our asynchronous I/O state machines as futures.
  4. Tokio. The tokio-core crate provides the central event loop which integrates with Mio to respond to I/O events, and drives futures to completion.
  5. Your program. A program using the Tokio framework can construct asynchronous I/O systems as futures, and provide them to the Tokio event loop for execution.

Mio: Metal I/O

Mio provides a low-level I/O API allowing callers to receive events such as socket read/write readiness changes. The highlights are:

  1. Poll and Evented. Mio supplies the Evented trait to represent anything that can be a source of events. In your event loop, you register a number of Evented's with a mio::Poll object, then call mio::Poll::poll() to block until events have occurred on one or more Evented objects (or the specified timeout has elapsed).
  2. System selector. Mio provides cross-platform access to the system selector, so that Linux epoll, Windows IOCP, FreeBSD/Mac OS kqueue(), and potentially others can all be used with the same API. The overhead required to adapt the system selector to the Mio API varies. Because Mio provides a readiness-based API similar to Linux epoll, many parts of the API can be one-to-one mappings when using Mio on Linux. (For example, mio::Events essentially is an array of struct epoll_event.) In contrast, because Windows IOCP is completion-based instead of readiness-based, a bit more adaptation is required to bridge the two paradigms. Mio supplies its own versions of std::net structs such as TcpListener, TcpStream, and UdpSocket. These wrap the std::net versions, but default to non-blocking and provide Evented implementations which add the socket to the system selector.
  3. Non-system events. In addition to providing readiness of I/O sources, Mio can also indicate readiness events generated in user-space. For example, if a worker thread finishes a unit of work, it can signal completion to the event loop thread. Your program calls Registration::new2() to obtain a (Registration, SetReadiness) pair. The Registration object is an Evented which can be registered with Mio in your event loop, and set_readiness() can be called on the SetReadiness object whenever readiness needs to be indicated. On Linux, non-system event notifications are implemented using a pipe. When SetReadiness::set_readiness() is called, a 0x01 byte is written to the pipe. mio::Poll's underlying epoll is configured to monitor the reading end of the pipe, so epoll_wait() will unblock and Mio can deliver the event to the caller. Exactly one pipe is created when Poll is instantiated, regardless of how many (if any) non-system events are later registered. (A small sketch of this mechanism follows this list.)
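
Here is a minimal sketch of a non-system event against the mio 0.6 API (the token number is arbitrary):

```rust
extern crate mio;

use mio::{Events, Poll, PollOpt, Ready, Registration, Token};
use std::thread;

fn main() {
    let poll = Poll::new().unwrap();
    let mut events = Events::with_capacity(16);

    // A user-space ("non-system") event source.
    let (registration, set_readiness) = Registration::new2();
    poll.register(&registration, Token(7), Ready::readable(), PollOpt::edge())
        .unwrap();

    // A worker thread signals completion from user-space; on Linux this
    // writes a byte to Mio's internal readiness pipe, waking epoll_wait().
    thread::spawn(move || {
        set_readiness.set_readiness(Ready::readable()).unwrap();
    });

    poll.poll(&mut events, None).unwrap();
    for event in events.iter() {
        assert_eq!(event.token(), Token(7));
    }
}
```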

Every Evented registration is associated with a caller-provided usize value typed as mio::Token, and this value is returned with events to indicate the corresponding registration. This maps nicely to the system selector in the Linux case, since the token can be placed in the 64-bit epoll_data union which functions in the same way.

To provide a concrete example of Mio operation, here's what happens internally when we use Mio to monitor a UDP socket on a Linux system (a combined code sketch follows these steps):

  1. Create the socket.

    This creates a Linux UDP socket, wrapped in a std::net::UdpSocket, which itself is wrapped in a mio::net::UdpSocket. The socket is set to be non-blocking.

  2. Create the poll.

    Mio initializes the system selector, readiness queue (for non-system events), and concurrency protection. The readiness queue initialization creates a pipe so readiness can be signaled from user-space, and the pipe's read file descriptor is added to the epoll. When a Poll object is created, it is assigned a unique selector_id from an incrementing counter.

  3. Register the socket with the poll.

    The UdpSocket's Evented.register() function is called, which proxies to a contained EventedFd which adds the socket's file descriptor to the poll selector (ultimately via epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &epoll_event), where epoll_event.data is set to the provided token value). When a UdpSocket is registered, its selector_id is set to the Poll's, thus associating it with the selector.

  4. Call poll() in an event loop.

    The system selector (epoll_wait()) and then the readiness queue are polled for new events. (The epoll_wait() blocks, but because non-system events trigger epoll via the pipe in addition to pushing to the readiness queue, they will still be processed in a timely manner.) The combined set of events is made available to the caller for processing.
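
Putting the four steps together, a minimal sketch against the mio 0.6 API (the address and token are arbitrary):

```rust
extern crate mio;

use mio::net::UdpSocket;
use mio::{Events, Poll, PollOpt, Ready, Token};

fn main() {
    // 1. Create the socket (non-blocking, wrapping std::net::UdpSocket).
    let addr = "127.0.0.1:3400".parse().unwrap();
    let socket = UdpSocket::bind(&addr).unwrap();

    // 2. Create the poll (sets up epoll plus the readiness-queue pipe).
    let poll = Poll::new().unwrap();

    // 3. Register the socket (an epoll_ctl(EPOLL_CTL_ADD, ...) under the hood).
    poll.register(&socket, Token(42), Ready::readable(), PollOpt::edge())
        .unwrap();

    // 4. The event loop: poll() wraps epoll_wait().
    let mut events = Events::with_capacity(1024);
    let mut buf = [0u8; 1500];
    loop {
        poll.poll(&mut events, None).unwrap();
        for event in events.iter() {
            if event.token() == Token(42) && event.readiness().is_readable() {
                // With edge triggering, drain the socket until WouldBlock.
                while let Ok((len, from)) = socket.recv_from(&mut buf) {
                    println!("{} bytes from {}", len, from);
                }
            }
        }
    }
}
```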

Futures and Tasks

Futures are a technique borrowed from functional programming whereby computation that has yet to happen can be represented as a 'future', and these individual futures can be combined to develop complex systems. This is useful for asynchronous I/O because the basic steps needed to perform transactions can be modeled as such combined futures. In the HTTP server example, one future may read a request by reading bytes as they become available until the end of the request is reached, at which time a 'Request' object is yielded. Another future may process a request and yield a response, and yet another future may write responses.

In Rust, futures are implemented in the futures crate. You can define a future by implementing the Future trait, which requires a poll() method that is called as needed to allow the future to make progress. This method returns either an error, an indication that the future is still pending (thus poll() should be called again later), or a yielded value if the future has reached completion. The Future trait also provides a great many combinators as default methods.

To understand futures, it is crucial to understand tasks, executors, and notifications — and how they arrange for a future's poll() method to be called at the right time. Every future is executed within a task context. A task itself is directly associated with exactly one future, but this future may be a composite future that drives many contained futures. (For example, multiple futures joined into a single future using the join_all() combinator, or two futures executed in series using the and_then() combinator.)

Tasks and their futures require an executor to run. An executor is responsible for polling the task/future at the correct times — usually when it has been notified that progress can be made. Such a notification happens when some other code calls the notify() method of the provided object implementing the futures::executor::Notify trait. An example of this can be seen in the extremely simple executor provided by the futures crate that is invoked when calling the wait() method on a future. From the source code:
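
The relevant method, paraphrased from futures 0.1 (exact details may differ between point releases):

```rust
pub fn wait_future(&mut self) -> Result<F::Item, F::Error> {
    ThreadNotify::with_current(|notify| {
        loop {
            match self.poll_future_notify(notify, 0)? {
                Async::NotReady => notify.park(),
                Async::Ready(e) => return Ok(e),
            }
        }
    })
}
```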

Given a futures::executor::Spawn object previously created to fuse a task and future, this executor calls poll_future_notify() in a loop. The provided Notify object becomes part of the task context and the future is polled. If a future's poll() returns Async::NotReady indicating that the future is still pending, it must arrange to be polled again in the future. It can obtain a handle to its task via futures::task::current() and call the notify() method whenever the future can again make progress. (Whenever a future is being polled, information about its associated task is stored in a thread-local which can be accessed via current().) In the above case, if the poll returns Async::NotReady, the executor will block until the notification is received. Perhaps the future starts some work on another thread which will call notify() upon completion, or perhaps the poll() itself calls notify() directly before returning Async::NotReady. (The latter is not common, since theoretically a poll() should continue making progress, if possible, before returning.)
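
To make the mechanics concrete, here is a toy future (entirely hypothetical) that notifies its own task so that it will be polled again:

```rust
extern crate futures;

use futures::task;
use futures::{Async, Future, Poll};

// Resolves only after being polled several times.
struct Countdown(u32);

impl Future for Countdown {
    type Item = ();
    type Error = ();

    fn poll(&mut self) -> Poll<(), ()> {
        if self.0 == 0 {
            Ok(Async::Ready(()))
        } else {
            self.0 -= 1;
            // Arrange to be polled again. Normally an I/O source does this
            // for us; here we simply notify our own task immediately.
            task::current().notify();
            Ok(Async::NotReady)
        }
    }
}

fn main() {
    // wait() uses the simple executor shown above.
    Countdown(3).wait().unwrap();
}
```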

The Tokio event loop acts as a much more sophisticated executor that integrates with Mio events to drive futures to completion. In this case, a Mio event indicating socket readiness will result in a notification that causes the corresponding future to be polled.

Tasks are the basic unit of execution when dealing with futures, and are essentially green threads providing a sort of cooperative multitasking, allowing multiple execution contexts on one operating system thread. When one task is unable to make progress, it will yield the processor to other runnable tasks. It is important to understand that notifications happen at the task level and not the future level. When a task is notified, it will poll its top-level future, which may result in any or all of the child futures (if present) being polled. For example, if a task's top-level future is a join_all() of ten other futures, and one of these futures arranges for the task to be notified, all ten futures will be polled whether they need it or not.

Tokio's interface with Mio

Tokio converts task notifications into Mio events by using Mio's 'non-system events' feature described above. After obtaining a Mio (Registration, SetReadiness) pair for the task, it registers the Registration (which is an Evented) with Mio's poll, then wraps the SetReadiness object in a MySetReadiness which implements the Notify trait. From the source code:
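
Approximately (trimmed from tokio-core 0.1):

```rust
struct MySetReadiness(mio::SetReadiness);

impl Notify for MySetReadiness {
    fn notify(&self, _id: usize) {
        self.0.set_readiness(mio::Ready::readable())
              .expect("failed to set readiness");
    }
}
```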

In this way, task notifications are converted into Mio events, and can be processed in Tokio's event handling and dispatch code along with other types of Mio events.

Just as Mio wraps std::net structs such as UdpSocket, TcpListener, and TcpStream to customize functionality, Tokio also uses composition and decoration to provide Tokio-aware versions of these types. For example, Tokio's UdpSocket looks something like this:
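
Roughly:

```rust
pub struct UdpSocket {
    io: PollEvented<mio::net::UdpSocket>,
}
```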

Tokio's versions of these I/O source types provide constructors that require a handle to the event loop (tokio_core::reactor::Handle). When instantiated, these types will register their sockets with the event loop's Mio poll to receive edge-triggered events with a newly assigned even-numbered token. (More on this below.) Conveniently, these types will also arrange for the current task to be notified of read/write readiness whenever the underlying I/O operation returns WouldBlock.

Tokio registers several types of Evented's with Mio, keyed to specific tokens:

  • Token 0 (TOKEN_MESSAGES) is used for Tokio's internal message queue, which provides a means of removing I/O sources, scheduling tasks to receive read/write readiness notifications, configuring timeouts, and running arbitrary closures in the context of the event loop. This can be used to safely communicate with the event loop from other threads. For example, Remote::spawn() marshals the future to the event loop via the message system.

    The message queue is implemented as a futures::sync::mpsc stream. As a futures::stream::Stream (which is similar to a future, except it yields a sequence of values instead of a single value), the processing of this message queue is performed using the MySetReadiness scheme mentioned above, where the Registration is registered with the TOKEN_MESSAGES token. When TOKEN_MESSAGES events are received, they are dispatched to the consume_queue() method for processing. (Source: enum Message, consume_queue())

  • Token 1 (TOKEN_FUTURE) is used to notify Tokio that the main task needs to be polled. This happens when a notification occurs which is associated with the main task. (In other words, the future passed to Core::run() or a child thereof, not a future running in a different task via spawn().) This also uses a MySetReadiness scheme to translate future notifications into Mio events. Before a future running in the main task returns Async::NotReady, it will arrange for a notification to be sent later in a manner of its choosing. When the resulting TOKEN_FUTURE event is received, the Tokio event loop will re-poll the main task.

  • Even-numbered tokens greater than 1 (TOKEN_START+key*2) are used to indicate readiness changes on I/O sources. The key is the Slab key for the associated Core::inner::io_dispatch Slab<ScheduledIo> element. The Mio I/O source types (UdpSocket, TcpListener, and TcpStream) are registered with such a token automatically when the corresponding Tokio I/O source types are instantiated.

  • Odd-numbered tokens greater than 1 (TOKEN_START+key*2+1) are used to indicate that a spawned task (and thus its associated future) should be polled. The key is the Slab key for the associated Core::inner::task_dispatch Slab<ScheduledTask> element. As with TOKEN_MESSAGES and TOKEN_FUTURE events, these also use the MySetReadiness plumbing.

Tokio event loop

Tokio, specifically tokio_core::reactor::Core, provides the event loop to manage futures and tasks, drive futures to completion, and interface with Mio so that I/O events will result in the correct tasks being notified. Using the event loop involves instantiating the Core with Core::new() and calling Core::run() with a single future. The event loop will drive the provided future to completion before returning. For server applications, this future is likely to be long-lived. It may, for example, use a TcpListener to continuously accept new incoming connections, each of which may be handled by their own future running independently in a separate task created by Handle::spawn().
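
A minimal usage sketch (the future here is a trivial placeholder):

```rust
extern crate futures;
extern crate tokio_core;

use futures::future;
use tokio_core::reactor::Core;

fn main() {
    // Create the event loop and drive a single future to completion.
    let mut core = Core::new().unwrap();
    let answer = core.run(future::ok::<u32, ()>(42)).unwrap();
    assert_eq!(answer, 42);
}
```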

The following flow chart outlines the basic steps of the Tokio event loop:

What happens when data arrives on a socket?

A useful exercise for understanding Tokio is to examine the steps that occur within the event loop when data arrives on a socket. I was surprised to discover that this ends up being a two-part process, with each part requiring a separate epoll transaction in a separate iteration of the event loop. The first part responds to a socket becoming read-ready (i.e., a Mio event with an even-numbered token greater than one for spawned tasks, or TOKEN_FUTURE for the main task) by sending a notification to the task which is interested in the socket. The second part handles the notification (i.e., a Mio event with an odd-numbered token greater than one) by polling the task and its associated future. We'll consider the steps in a scenario where a spawned future is reading from a UdpSocket on a Linux system, from the top of the Tokio event loop, assuming that a previous poll of the future resulted in a recv_from() returning a WouldBlock error.

The Tokio event loop calls mio::Poll::poll(), which in turn (on Linux) calls epoll_wait(), which blocks until some readiness change event occurs on one of the monitored file descriptors. When this happens, epoll_wait() returns an array of epoll_event structs describing what has occurred, which are translated by Mio into mio::Events and returned to Tokio. (On Linux, this translation should be zero-cost, since mio::Events is just a single-tuple struct of an epoll_event array.) In our case, assume the only event in the array is indicating read readiness on the socket. Because the event token is even and greater than one, Tokio interprets this as an I/O event, and looks up the details in the corresponding element of Slab<ScheduledIo>, which contains information on any tasks interested in read and write readiness for this socket. Tokio then notifies the reader task which, by way of the MySetReadiness glue described earlier, calls Mio's set_readiness(). Mio handles this non-system event by adding the event details to its readiness queue, and writing a single 0x01 byte to the readiness pipe.

After the Tokio event loop moves to the next iteration, it once again polls Mio, which calls epoll_wait(), which this time returns a read readiness event occurring on Mio's readiness pipe. Mio reads the 0x01 which was previously written to the pipe, dequeues the non-system event details from the readiness queue, and returns the event to Tokio. Because the event token is odd and greater than one, Tokio interprets this as a task notification event, and looks up the details in the corresponding element of Slab<ScheduledTask>, which contains the task's original Spawn object returned from spawn(). Tokio polls the task and its future via poll_future_notify(). The future may then read data from the socket until it gets a WouldBlock error.

This two-iteration approach involving a pipe write and read may add a little overhead when compared to other asynchronous I/O event loops. In a single-threaded program, it is weird to look at the strace and see a thread use a pipe to communicate with itself:

Mio uses this pipe scheme to support the general case where set_readiness() may be called from other threads, and perhaps it also has some benefits in forcing the fair scheduling of events and maintaining a layer of indirection between futures and I/O.

Lessons learned: Combining futures vs. spawning futures

When I first started exploring Tokio, I wrote a small program to listen for incoming data on several different UDP sockets. I created ten instances of a socket-reading future, each of them listening on a different port number. I naively joined them all into a single future with join_all(), passed the combined future to Core::run(), and was surprised to discover that every future was being polled whenever a single packet arrived. Also somewhat surprising was that tokio_core::net::UdpSocket::recv_from() (and its underlying PollEvented) was smart enough to avoid actually calling the operating system's recvfrom() on sockets that had not been flagged as read-ready in a prior Mio poll. The strace, reflecting a debug println!() in my future's poll(), looked something like this:

Since the concrete internal workings of Tokio and futures were somewhat opaque to me, I suppose I hoped there was some magic routing happening behind the scenes that would only poll the required futures. Of course, armed with a better understanding of Tokio, it's obvious that my program was using futures like this:

This actually works fine, but is not optimal — especially if you have a lot of sockets. Because notifications happen at the task level, any notification arranged in any of the green boxes above will cause the main task to be notified. It will poll its JoinAll future, which itself will poll each of its children. What I really need is a simple main future that uses Handle::spawn() to launch each of the other futures in their own tasks, resulting in an arrangement like this:

When any future arranges a notification, it will cause only the future's specific task to be notified, and only that future will be polled. (Recall that 'arranging a notification' happens automatically when tokio_core::net::UdpSocket::recv_from() receives WouldBlock from its underlying mio::net::UdpSocket::recv_from() call.) Future combinators are powerful tools for describing protocol flow that would otherwise be implemented in hand-rolled state machines, but it's important to understand where your design may need to support separate tasks that can make progress independently and concurrently.

Final thoughts

Studying the source code of Tokio, Mio, and futures has really helped solidify my comprehension of Tokio, and validates my strategy of clarifying abstractions through the understanding of their concrete implementations. This approach could pose a danger of only learning narrow use cases for the abstractions, so we must consciously consider the concretes as only being examples that shed light on the general cases. Reading the Tokio tutorials after studying the source code, I find myself with a bit of a hindsight bias: Tokio makes sense, and should have been easy to understand to begin with!

I still have a few remaining questions that I'll have to research some other day:

  • Does Tokio deal with the starvation problem of edge triggering? I suppose it could be handled within the future by limiting the number of read/writes in a single poll(). When the limit is reached, the future could return early after explicitly notifying the current task instead of relying on the implicit 'schedule-on-WouldBlock' behavior of the Tokio I/O source types, thus allowing other tasks and futures a chance to make progress.
  • Does Tokio support any way of running the event loop itself on multiple threads, instead of relying on finding opportunities to offload work to worker threads to maximize use of processor cores?

UPDATE 2017-12-19: There is a Reddit thread on r/rust discussing this post. Carl Lerche, author of Mio, has posted some informative comments here and here. In addition to addressing the above questions, he notes that FuturesUnordered provides a means of combining futures such that only the relevant child futures will be polled, thus avoiding polling every future as join_all() would, with the tradeoff of additional allocations. Also, a future version of Tokio will be migrating away from the mio::Registration scheme for notifying tasks, which could streamline some of the steps described earlier.

UPDATE 2017-12-21: It looks like Hacker News also had a discussion of this post.

UPDATE 2018-01-26: I created a GitHub repository for my Tokio example code.

posted at 2017-12-18 06:35:38 US/Mountain by David Simmons
tags: rust tokio io