About a year ago I explained in this article how to implement in C an interactive graphics display using GTK, multiprocessing and a shared memory segment. In the present article I'll introduce another solution: X11 and multithreading.
The X Window System (aka X11, official webpage here) is a low-level windowing system primarily developed for Unix-like OSes and later ported to other OSes. It's very basic in the features it offers (create and manage windows on a display, draw some primitive 2D shapes, and interact with the mouse and keyboard), but also super versatile, as it is architecture-independent and available on all *nix systems. It's now progressively being replaced by its successor, Wayland, which has been becoming the default on more and more systems since the late 2010s. The official webpage itself recommends using GTK (cf my previous article) or, for low-level development, XCB. However, given that it has been the standard for several decades and is still the backbone of tons of code (see this page for an old but significant list of libraries and toolkits based on X11), it's not going to disappear anytime soon. Knowing about it is surely profitable one way or another.
It's true that if you don't like to "do it yourself" and dive down to bit level, X11 is not for you. On the other hand, it is certainly not what some describe as a completely impractical monster only tamed by voodoo rituals inscribed in its arcane documentation. As I'll show below, less than 300 lines of code are all you need for multithreading, a triple RGB buffer, a resizable window, buffered mouse/key events, a clock, and refresh/frame rate measurement, all packed in a super easy to use structure and its dozen functions. The only criticism I would agree with is the astonishing absence of a complete working example on the web. There are tons of code snippets here and there, but during my two days of X11 study to write this article I could never find something similar to what I provide below. All the code in this article is available for download (cf below); it will surely be helpful to someone else.
Enough chatting, let's see how it works. The first thing is to open a connection to the X server of a display device. For historical reasons, the X Window System was developed to work remotely over a network. Even if you want to open a window on your local machine, it uses a client-server model. However, that doesn't mean you have to bother about the network layer; everything is done in one single line: Display* display = XOpenDisplay(displayName);. displayName is the network name of the device on which you want to display the window. On the first screen of the local machine it's as simple as :0.0, and you don't even need to specify it with the functions I provide, as it is the default value. If you have several screens or want to display the window on a remote machine, you can select it instead with hostname:server_number.screen_number, where hostname is the IP (or a name that resolves to it via DNS) of the machine, the server number is the ID of the X server on that machine (most often 0), and the screen number is the ID of the screen if several are connected to the machine (also most often 0). Aaah the good ol' days at university when we discovered we could display at will "funny" pictures on classmates' terminals... :-)
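To make the hostname:server_number.screen_number format concrete, here is a minimal sketch that builds such a string in plain C. The function name formatDisplayName is hypothetical (it is not part of X11 nor of the code provided in this article); the resulting string is what one would pass to XOpenDisplay().

```c
#include <assert.h>
#include <stdio.h>

// Hypothetical helper: build an X11 display name of the form
// "hostname:server_number.screen_number" into buf.
// An empty hostname designates the local machine, giving e.g. ":0.0".
// Returns the number of characters written (excluding the terminating
// null), or -1 if the buffer is too small.
int formatDisplayName(char* buf, size_t size, const char* hostname,
                      int serverNumber, int screenNumber) {
  assert(buf != NULL && hostname != NULL);
  int n = snprintf(buf, size, "%s:%d.%d", hostname, serverNumber,
                   screenNumber);
  return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

For example, an empty hostname with server 0 and screen 0 gives :0.0, while hostname 192.168.1.10 with server 0 and screen 1 gives 192.168.1.10:0.1.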
The next step is to create the actual window. Windows are organised in a tree structure for management, so a new window's parent must be specified. For a single window use-case there is no need to worry and a default one is fine: Window parentWindow = XDefaultRootWindow(display);. An initial color must be given to the new window's content. Given that the content is supposed to be rendered and updated later, a default black will do: unsigned long black = XBlackPixel(display, 0);. There are tons of other parameters if you want fine control, but default values are already enough for something functional. A window of given initial width and height can be created with Window window = XCreateSimpleWindow(display, parentWindow, 0, 0, width, height, 0, black, black);. Bonus, the window's title can be set with XStoreName(display, window, title);.
For the window to be interactive it must listen to events from the user. This is not the case by default, for performance reasons. X11 being a client-server model, events occur on the X server side and are sent to the client side for processing. To avoid the transmission of useless events, one must specify those that are relevant. In the code I provide, window resize, mouse and keyboard events are selected with long evtMask = ExposureMask | StructureNotifyMask | KeyPressMask | KeyReleaseMask | ButtonPressMask | ButtonReleaseMask | PointerMotionMask; XSelectInput(display, window, evtMask);.
For performance reasons too, the code I provide uses a separate thread to manage these events and the update of the window's graphical content. Upon reception of events from the server, the thread stores them in a queue until the main thread processes them. The reception of events from the server is done in two steps. First, check if there are waiting events: XEvent event; bool waitingEvent = XCheckWindowEvent(display, window, evtMask, &event);. Then, if there was an event waiting, process it: if(waitingEvent) switch(event.type) { ... }.
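The queue between the two threads can be sketched as follows. This is only an illustration under assumptions, not necessarily how the downloadable code does it: I use here a lock-free single-producer/single-consumer ring buffer based on C11 atomics (a mutex-protected queue would work as well), and InputEvent, EventQueue, EVENT_QUEUE_CAPACITY and the three functions are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

// Hypothetical capacity; the queue holds at most CAPACITY - 1 events.
#define EVENT_QUEUE_CAPACITY 64

// Hypothetical, simplified event record (a real one would hold
// whatever is extracted from the XEvent).
typedef struct {
  int type;
  int x, y;
} InputEvent;

// Ring buffer with exactly one producer (the display thread) and
// exactly one consumer (the main thread).
typedef struct {
  InputEvent events[EVENT_QUEUE_CAPACITY];
  atomic_uint head;  // next slot to read, written only by the consumer
  atomic_uint tail;  // next slot to write, written only by the producer
} EventQueue;

void eventQueueInit(EventQueue* q) {
  assert(q != NULL);
  atomic_init(&q->head, 0);
  atomic_init(&q->tail, 0);
}

// Producer side. Returns false (event dropped) if the queue is full.
bool eventQueuePush(EventQueue* q, InputEvent evt) {
  unsigned tail = atomic_load_explicit(&q->tail, memory_order_relaxed);
  unsigned next = (tail + 1) % EVENT_QUEUE_CAPACITY;
  if (next == atomic_load_explicit(&q->head, memory_order_acquire))
    return false;
  q->events[tail] = evt;
  // Publish the slot only after its content has been written.
  atomic_store_explicit(&q->tail, next, memory_order_release);
  return true;
}

// Consumer side. Returns false if there is no waiting event.
bool eventQueuePop(EventQueue* q, InputEvent* out) {
  unsigned head = atomic_load_explicit(&q->head, memory_order_relaxed);
  if (head == atomic_load_explicit(&q->tail, memory_order_acquire))
    return false;
  *out = q->events[head];
  atomic_store_explicit(&q->head, (head + 1) % EVENT_QUEUE_CAPACITY,
                        memory_order_release);
  return true;
}
```

With such a queue the display thread would call eventQueuePush() from inside its switch(event.type), and the main thread would drain the events with eventQueuePop() whenever convenient.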
There are three types of mouse events: when the pointer moves over the window (case MotionNotify:), and when a mouse button is pressed or released (case ButtonPress: and case ButtonRelease:). In each case the event received with XCheckWindowEvent() is cast into the appropriate structure to get access to the event information: XMotionEvent* motionEvent = (XMotionEvent*)&event; (and XButtonEvent for press and release). The event's information includes the button ID, the state (shift/ctrl/...), the time, and the location (cf the doc for details): motionEvent->x, motionEvent->y, etc... It's rather straightforward to use.
There are two types of keyboard events: when a key is pressed or released (case KeyPress: and case KeyRelease:). As for the button events, casting gives access to the event's information: XKeyEvent* keyEvent = (XKeyEvent*)&event;. However, to abstract keyboard mapping the returned event gives the pressed key as a keycode. It must be converted using unsigned long keysym = XLookupKeysym(keyEvent, 0); to get the actually useful keysym value. That value can be compared to predefined values XK_a, XK_b, ... XK_space, XK_Escape, ... (cf /usr/include/X11/keysymdef.h). For printable ASCII characters these conveniently equal the ASCII values (so if(keysym == XK_a) is the same as if(keysym == 'a')), while other keys such as XK_Escape have their own dedicated values. Note, this is the "simple" version, refer to this gist instead if you're using an IME.
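As an illustration of that keysym/ASCII correspondence, here is a small sketch. To keep it compilable without the X11 headers, the few keysym values it needs are copied from keysymdef.h under SKETCH_ prefixed names; these names and the two helper functions are hypothetical, in real code one would include X11/keysym.h and use XK_space, XK_Escape, etc. directly.

```c
#include <assert.h>
#include <stdbool.h>

// Values copied from X11/keysymdef.h (hypothetical SKETCH_ names, only
// there so that this sketch compiles without the X11 headers).
#define SKETCH_XK_space 0x0020UL
#define SKETCH_XK_asciitilde 0x007eUL
#define SKETCH_XK_Escape 0xff1bUL

// Return the ASCII character matching a keysym, or 0 if the keysym is
// not a printable ASCII character (Escape, arrows, function keys, ...).
char keysymToAscii(unsigned long keysym) {
  if (keysym >= SKETCH_XK_space && keysym <= SKETCH_XK_asciitilde)
    return (char)keysym;
  return 0;
}

// True if keysym is the key producing the printable ASCII character c.
bool keysymIsChar(unsigned long keysym, char c) {
  assert(c >= ' ' && c <= '~');
  return keysym == (unsigned long)c;
}
```

In an event handler, keysymToAscii(keysym) would be given the value returned by XLookupKeysym(); a result of 0 means the key has to be compared to its XK_... constant instead.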
As for other events, the resize event (case ConfigureNotify:) is cast (XConfigureEvent* configureEvent = (XConfigureEvent*)&event;) to get access to the window's new width and height (configureEvent->width and configureEvent->height). These dimensions should be memorised and the content of the window should be rendered/updated accordingly (more on this later). The "philosophy" of X11 is to let the server side control the dimensions, and to obey them on the client side. Finally, the exposure event (case Expose:) is used by the server to explicitly require the update of the window's content (normally the client decides freely when it wants to update it).
Now the interesting part: drawing the window's content. Contrary to the solution of my previous article, I wanted this time to try something more responsive, able to display smooth animations (as long as the rendering part is fast enough). Thus, I've chosen to implement a triple buffer solution. It works as follows. The graphical content of the window is memorised as RGB values in an array unsigned char data[height * width * 3]. X11 allows for virtually any arrangement, except that data must be stored by row. Three copies of that array are held in memory and used as a circular buffer. Two indices, \(i_d\) and \(i_r\), memorise the array corresponding to the currently displayed content and the currently rendered content. The third array is a buffer which solves the synchronisation problems between the thread displaying the content and the thread rendering it. The two following simple rules allow for real time rendering without the two threads ever interfering with or waiting for each other. 1) At the beginning of every display step, if \(((i_d+1)\bmod 3)\ne i_r\) then \(i_d=(i_d+1)\bmod 3\). 2) At the end of every rendering step (publication step), if \(((i_r+1)\bmod 3)\ne i_d\) then \(i_r=(i_r+1)\bmod 3\).
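The two rules can be sketched in a few lines of C. This is a single-threaded illustration of the index logic only, with hypothetical names (TripleBufferIdx and its two functions); I assume the two indices start with different values, and in actual multithreaded code the reads and writes of the indices would additionally need to be atomic.

```c
#include <assert.h>
#include <stdbool.h>

// Hypothetical structure holding the two indices of the circular
// buffer of three arrays.
typedef struct {
  int displayed;  // i_d, index of the array currently displayed
  int rendered;   // i_r, index of the array currently rendered
} TripleBufferIdx;

// Rule 1, applied at the beginning of every display step.
// Returns true if the displayed index moved to a newer array.
bool tripleBufferDisplayStep(TripleBufferIdx* idx) {
  assert(idx->displayed != idx->rendered);
  int next = (idx->displayed + 1) % 3;
  if (next == idx->rendered) return false;  // nothing newer to show
  idx->displayed = next;
  return true;
}

// Rule 2, applied at the end of every rendering step (publication).
// Returns true if the rendered index moved to the free array, false
// if the next rendering pass will overwrite the same array.
bool tripleBufferPublishStep(TripleBufferIdx* idx) {
  assert(idx->displayed != idx->rendered);
  int next = (idx->rendered + 1) % 3;
  if (next == idx->displayed) return false;  // display is still there
  idx->rendered = next;
  return true;
}
```

Starting from displayed = 0 and rendered = 1, a display step leaves displayed at 0 (array 1 is still being rendered), a publication step moves rendered to 2, and the following display step moves displayed to 1. In every state only one of the two indices is allowed to move, which is why they never end up on the same array.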
Note that as we allow for window resizing, the dimensions of the array may vary at any time. It is then presented to the rendering algorithm in a structure including a pointer to the array itself and its actual dimensions. So, at each rendering step, array \(i_r\) is resized (realloc) if necessary, then the array plus dimensions is passed to the rendering algorithm and updated, then \(i_r\) is updated. The displaying thread always displays the content according to the \(i_d\) (array + dimensions) pair. It means there may be a discrepancy between these dimensions and the actual window dimensions at the time of display. Fortunately, X11 is sturdy enough to bear with that, clipping or filling as necessary until the following rendering pass catches up with the window's new size.
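The "array plus dimensions" structure and its resizing can be sketched as below. FrameBuffer and frameBufferResize are hypothetical names, not those of the downloadable code, and the sketch assumes the renderer redraws the whole array after a resize (realloc preserves bytes, not the row layout).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

// Hypothetical structure: one of the three arrays and its dimensions.
typedef struct {
  unsigned char* data;  // height * width * 3 bytes, RGB, row by row
  int width;
  int height;
} FrameBuffer;

// Resize the array if its dimensions differ from the requested ones.
// Returns false if the allocation fails, in which case the buffer is
// left unchanged.
bool frameBufferResize(FrameBuffer* buf, int width, int height) {
  assert(buf != NULL && width > 0 && height > 0);
  if (buf->data != NULL && buf->width == width && buf->height == height)
    return true;  // already at the right size
  size_t size = (size_t)width * (size_t)height * 3;
  unsigned char* data = realloc(buf->data, size);
  if (data == NULL) return false;
  buf->data = data;
  buf->width = width;
  buf->height = height;
  return true;
}
```

A buffer initialised with FrameBuffer buf = {NULL, 0, 0}; is allocated on the first call, since realloc(NULL, size) behaves like malloc(size), and must be released with free(buf.data) at the end.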
The displaying thread sends the window's graphical content to the server using an XImage structure, encapsulating all the info necessary for the server to actually update the window. Except for the array values and dimensions that structure never changes, so it's a good idea to create it once at the beginning, update it as necessary and reuse it. In the code below, I prepare that structure as follows: XImage image = {.xoffset = 0, .format = ZPixmap, .byte_order = MSBFirst, .bitmap_pad = 32, .depth = 24, .bits_per_pixel = 24, .red_mask = 0xff0000, .green_mask = 0x00ff00, .blue_mask = 0x0000ff, .obdata = NULL, .f = {0}}; XInitImage(&image);. All those values encode the format of the array as I've described it above. When it's time to send the image to the server, image is updated as follows: image.width = array[displayed].width;, image.height = array[displayed].height;, image.data = (char*)(array[displayed].data);, image.bytes_per_line = 3 * array[displayed].width;. And it is sent to the server as follows: XPutImage(display, window, gc, &image, 0, 0, 0, 0, array[displayed].width, array[displayed].height);, where gc is the graphical context, which doesn't really matter here so the default one does the job: GC gc = XDefaultGC(display, 0);.
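To see what that format means for the rendering side, here is a sketch of how a pixel would be written into the array. It assumes, as I understand the settings above (24 bits per pixel, MSBFirst, red mask 0xff0000), that each pixel occupies three consecutive bytes in the order red, green, blue, and that rows are packed with bytes_per_line = 3 * width. The names pixelOffset and setPixel are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

// Byte offset of pixel (x, y) in an array of rows of 3 * width bytes.
size_t pixelOffset(int width, int x, int y) {
  size_t bytesPerLine = 3 * (size_t)width;
  return (size_t)y * bytesPerLine + 3 * (size_t)x;
}

// Write the RGB value of pixel (x, y), with (0, 0) the top-left
// corner of the window.
void setPixel(unsigned char* data, int width, int height, int x, int y,
              unsigned char r, unsigned char g, unsigned char b) {
  assert(data != NULL);
  assert(x >= 0 && x < width && y >= 0 && y < height);
  size_t offset = pixelOffset(width, x, y);
  data[offset] = r;      // most significant byte of the pixel first
  data[offset + 1] = g;
  data[offset + 2] = b;
}
```

For a 4x2 image, pixel (1, 1) starts at byte 1 * 12 + 3 = 15 of the array.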
And that's it, that's already enough to have a resizable window responding to mouse and keyboard and ready to display graphics in real time. I've added a measure of frame rate (number of frames rendered per second) and refresh rate (number of frames sent to the X server per second), hidden the details and packed everything in a little structure and its functions for convenience. The example code below shows how to use it:
Compile with the following Makefile:
Edit on 2023/07/24: added -D_POSIX_C_SOURCE=199309L, needed to define clock related functions and macros, and corrected a bug in X11DisplayGetBuffer().
The header file x11display.h is as follows:
And the body x11display.c is as follows:
Download the code here or with the command wget https://baillehachepascal.dev/2022/Data/X11/x11display.tar.gz, and extract it with tar xvf x11display.tar.gz.
Edit on 2023/07/28: I've modified the tarball so that it extracts into a subdirectory 'X11Display' instead of the current directory.