planet.freedesktop.org
May 04, 2015

A year has passed since the first release of The Hacker's Guide to Python in March 2014. A few hundred copies have been distributed so far, and the feedback has been wonderful!

I already wrote extensively about the making of that book last year, and I cannot emphasize enough how this adventure has been amazing so far. That's why I decided a few months ago to update the guide and add some new content.

So let's talk about what's new in this second edition of the book!

First, I obviously fixed a few things. I received some reports of small mistakes and typos, which I corrected as they came in. Not a lot, fortunately, but it's still better to have fewer errors in a book, right?

Then, I updated some of the content. Things changed since I wrote the first chapters of that guide 18 months ago. Therefore I had to rewrite some of the sections and take into account new software or libraries that were released.

Finally, I decided to enhance the book with one more interview. I asked my fellow OpenStack developer Joshua Harlow, who leads a few interesting Python projects, to join the long list of interviewees in the book. I hope you'll enjoy it!

If you didn't get the book yet, go check it out and use the coupon THGTP2LAUNCH to get 20% off during the next 48 hours!

April 23, 2015

I just wanted to say thank you to everyone who commented and provided feedback on my last blog entry, where I asked what would make you switch to Fedora Workstation. We will take all that feedback and incorporate it into our Fedora Workstation 23 planning. So a big thanks to all of you!

You probably saw the phoronix article which references the log of the #dri-devel channel on freenode. This was an attempt to trash my work on lima and tamil, using my inability to get much code done, and my unwillingness to throw hackish tamil code over the wall, against me. Let me take some time to explain some of that from my point of view.

Yes, I have lost traction.

Yes, I am not too motivated to work on cleaning up code at this time. I haven't been too motivated for much of anything since FOSDEM 2014. I still have my sunxi KMS code to fix up and clean up, and the same is true for the lima driver for mesa. It is also pretty telling that I started to write this blog entry more than two months ago and only now managed to post it.

Under the circumstances, though, anyone else would've given up two to three years ago.

Due to my usual combination of stubbornness and actually sitting down and doing the work, I personally kickstarted the whole open ARM GPU movement. This is the third enormous paradigm shift in Linux graphics that I have caused in the past almost 12 years (after modesetting, and pushing ATI to a point of no return with an open source driver). All open ARM GPU projects were at least inspired by this, some actually use some lima code, and others would simply not have happened if I hadn't done Lima and/or had kept my mouth shut at important junctures. This was not without sacrifices. Quite the opposite.

From March 2011 on, I spent an insane amount of time on this. Codethink paid for, all in all, six weeks of my time when I was between customer projects. Some of the very early work was done on Nokia time as, thanks to our good friend Stephen Elop, operations were winding down severely in late 2011, to the point where little more than constant availability was needed. However, I could've used that extra free time in many other useful ways. When I got a job offer from SuSE in November 2011 (basically getting my old job back, due to Matthias Hopf taking up a professorship), I turned it down so I could work on the highly important Lima work instead.

When Codethink ran into a cashflow issue in October 2012 (which apparently was resolved quite successfully, as Codethink has grown a lot since then), I was shown the door. This wasn't too unreasonable a decision, given my increasing disappointment with how lima was being treated, the customer projects I was being sent on, and the assumption that a high-profile developer like me would have an easy time finding another employer. During the phone call announcing my dismissal, I was however told that ARM had already been informed of it, so that business relations between ARM and Codethink could be resumed. I rather doubt that that is standard procedure for dismissing people.

The time after this has been pretty meagre. If you expected at least one of Linaro, Canonical, Mozilla or Jolla to support lima, you'd be pretty wrong. Similarly, I have not seen any credible attempt to support lima driver development on a consulting basis. Then there is the fact that applying to companies which use Mali but have no interest in an open source driver for it is completely out, as that would mean working on the ARM binary driver, which in turn would mean that I would no longer be allowed to work on lima. All other companies have not gotten with the times with respect to hiring policies; this ranges from demanding that people relocate, to interviewing solely about OpenGL spec corners for supposed driver development positions. There was one case in this whole time which had a proper first-level interview, and where everything seemed to be working out, only to go dead without a proper explanation after a while. I have since learned from two independent sources that my hiring was stopped due to politics involving ARM. Lima has done wonders for making me totally unhirable.

In May 2013 I wrote another proposal to ARM for an open source strategy for Mali (the first was done in Q2 2011 as part of Codethink), hoping that in the meantime sentiments had shifted enough and that the absence of an in-between company would make the difference this time round. It got the unofficial backing of some people inside ARM, but they just ended up getting their heads chewed off, and I hope that they didn't suffer too much for trying to do the right thing. The speed with which this proposal was rejected by Jem Davies (ARM MPD VP of Technology), and the undertone of the email reply he sent me, suggest that his mind was made up beforehand and that he wasn't too happy about my actions this time either. This is quite amazing, as one would expect anyone at the VP level of a major company to at least keep his options open. Between the way in which this proposal was rejected and the political pressure ARM seems to apply against Lima, Mr Davies's claims that ARM simply has no customer demand and no budget, and nothing more, are very hard to believe.

So after the polite applause at the end of my FOSDEM 2014 talks, it hit me. That applause was the only support I was ever going to get for my work, and I was just going to continue to hit walls everywhere, both with lima and in life. And it would take another year until the next applause, if it came at all. To be absolutely honest, given the way the modesetting and ATI stories worked out, I should be used to that by now, however hard it still is. But this time round I had been (and of course still am) living with my fantastic and amazingly understanding girlfriend for several years, and she ended up supporting me through Q4 2013 and most of 2014 when my unemployment benefit ran out. This time round, my trailblazing wasn't backfiring on me alone; I was dragging someone else down with me, someone who deserves much better.

I ended up barely able to write code throughout most of 2014, and focused entirely on linux-sunxi. The best way to block things out was rewriting the linux-sunxi wiki, and that wiki is now light-years ahead of everything else out there, and an example for everyone else (like linux-exynos and linux-rockchip) for years to come. The few code changes I did make were always tiny and trivial. In August, Siarhei Siamashka convinced me that a small display driver for sunxi for U-Boot would be a good quick project, and I hoped that it would whet my appetite for proper code again. Sadly though, the fact that sunxi doesn't hide all chip details like the Raspberry Pi (the target device for simplefb) does meant that simplefb had to be (generically) extended, and this proved once and for all that simplefb is a total misnomer and that people had been kidding themselves all along (I seem to always run into things like this). The resulting showdown had about 1.5 times as many emails as simplefb has lines of C code. Needless to say, this did not help my mood either.

In June 2014, a very good friend of mine, who has been a Qt consultant for ages, convinced me to try consulting again. It took until October before something suitable came along. That project was pretty hopeless from the start, but it restored my faith in my problem-solving skills and how valuable those really can be, both for a project and, for a change, for myself. While I wasn't able to afford traveling to Bordeaux for XDC 2014, I was able to afford FOSDEM 2015, and I am paying for my own upkeep for the first time in a few years. Finally replacing my incredibly worn and battered ProBook is still out of the question though.

When I was putting together the FOSDEM Graphics DevRoom schedule around Christmas, I noticed that there was an inexcusably low number of talks, and my fingers became itchy again. The consulting job had boosted morale sufficiently to do some code again, and showing off some initial renders on the Mali T series seemed doable within the five or so weeks left. It gave the FOSDEM graphics devroom another nice scoop, and filled in one of the many, many gaps in the schedule. This time round I didn't kid myself that it would change the world or help boost my profile or anything. This time round, I was just going to prove to myself that I have learned a lot since doing Lima, and that I can still do this sort of thing, with ease, while having fun doing so and not exerting myself too much. And that's exactly what I did, nothing more and nothing less.

The vultures are circling.

When Egbert Eich and I were brainstorming about freeing ATI back in April and May of 2007, I stated that I feared that certain "community" members would not accept me doing something this big, and I named three individuals by name. While I was initially alone with that view, we did decide that, in light of the shitthrowing contest around XGL and compiz, doing everything right for RadeonHD was absolutely paramount. This is why we did everything in the open, at least from the moment AMD allowed us to do so: open docs, an IRC channel, a public mailing list, and as short a turnaround on patches as possible given our quality standards and the fact that ATI was not helping at all. I did have a really tough uphill battle convincing everyone that doing native C code was the only way as, even with all the technical arguments for native code, I knew that my own reputation was at stake. I knew that if I had accepted AtomBIOS for basic modesetting, after only just having convinced the world that BIOS-free modesetting was the only way forward, I would've personally been nailed to the cross by those same people.

From the start we knew that ATI was not going to go down without a fight on this one, and I feared that those "community" members would do many things to try to damage this project, but we did not anticipate AMD losing political control over ATI internally, nor did we anticipate the lack of scruples of those individuals. AMD needed this open source driver to be good, as the world hated fglrx to the extent that the ATI reputation was dragging AMD's chip and server business down, and AMD was initially very happy with our work on RadeonHD. Yet to our amazement, some ATI employees sided with those same "community" members, and together they worked on a fork of RadeonHD, but this time doing everything the ATI way, like fglrx.

The amount of misinformation and sometimes even downright slander in this whole story was amazing. Those individuals were actively working on remarketing RadeonHD as an evil Microsoft/Novell conspiracy, and while the internet had previously been shouting "death to fglrx", it ended up shouting "death to radeonhd". Everyone only seemed to want to join in the shouting, and nobody took a step back to analyze what was really going on. This was ATI & Red Hat vs AMD & SuSE, and it had absolutely nothing to do with creating the best possible open source driver for ATI hardware.

During all of this I was being internally reminded that SuSE is the good guy and that it doesn't stoop down to the level of the shitthrowers. Being who I am, I didn't always manage to restrain myself, but even today I feel that if I had been more vocal, I could've warded off some of the most inane nonsense that was spewed. When Novell clipped SuSE's wings even further after FOSDEM 2009, I was up for the chop too, and I was relieved, as I could finally get some distance between myself and this clusterfuck (that once was a free software dream). I also knew that my reputation was in tatters, and that if I had been more at liberty to debunk things, the damage could've been limited.

The vultures had won, and no-one had called them out for their insidious and shameless behaviour. In fact, they felt empowered, free to use whatever nasty and underhand tactics against anything that doesn't suit them for political or egotistical reasons. This is why the hacking of the RadeonHD repository occurred, and why the reaction of the X.org community was aimed against the messenger and victim, and not against the evil-doers (I was pretty certain before I sent that email that it would be either one or the other; I hadn't foreseen that it would be both though).

So when the idea of Lima formed in March 2011, I no longer just feared that these people would react. I knew for a fact that they would be there, ready to pounce on the first misstep made. This fed my frustration with how Lima was being treated by Codethink, as I knew that, in the end, I would be the one paying the price again. Similarly, if my code was not good enough, these individuals would think nothing of stepping in, rewriting history, and then trashing me for not having done corner X or Y, as that was part of their modus operandi when they destroyed RadeonHD.

So these last few years, I have been caught in a catch-22. ARM GPUs are becoming free, and I have clearly achieved what I set out to achieve. The fact that I so badly misjudged both ARM and the level of support I would get should not take away from the fundamental changes that came from my labour. Logically and pragmatically, it should be pretty hard to fault me in this whole story, but that is not how this game works. No matter what fundamental difference I could've made with lima, no matter how large and clear my contributions and sacrifices would be, these people would still try to distort history and do their utmost to defame me. And now that moment has come.

But there are a few fundamental differences this time round: I am able to respond as I see fit, as there is no corporate structure to shut me up, and I truly have nothing left to lose.

What started this?

In short:
12:59 #dri-devel: < Jasper> the alternative is fund an open-source replacement, but touching
                  lima is not an option
Jasper St. Pierre is very vocal on #dri-devel, constantly coming up with questions and asking for help or guidance. But he and his employer Endless Mobile seem to have their minds set on using the ARM Mali DDK, ARM's binary driver. He is working on integrating the Mali DDK with things like Wayland, and he has been very active in getting help to do so for several months now.

The above IRC quote is not what I initially responded to, though; the following is:
13:01 #dri-devel: < Jasper> lima is tainted
13:01 #dri-devel: < Jasper> there's still code somewhere on a certain person's hard disk he
                  won't release because he's mad about things and the other project contributors
                  all quit
13:01 #dri-devel: < Jasper> you know that
13:02 #dri-devel: < Jasper> i'm not touching lima with a ten foot pole.
This is pure unadulterated slander.

Lima is not in any way tainted. It is freshly written MIT-licensed code, and I was never exposed to original ARM code. This statement is an outrageous lie, and it is amazing that anyone in the open source world would lie like that and expect to get away with it.

Secondly, I am not "mad about things". I am severely demotivated, which is quite different. But even then, apart from me not producing much code, there had been no real indication to Jasper St. Pierre until that point that this was the case. Also, "all" the contributors to Lima did not quit. I am still here, albeit not able to produce much in the way of code at the moment. Connor is in college now and just spent the summer interning for Intel, writing a new IR for mesa (and was involved with Khronos SPIR-V). Ben Brewer only worked on Lima on Codethink's time, and that stopped halfway through 2012. Those were the only major contributors to Lima. The second line Jasper said there was another baseless lie.

Here is another highly questionable statement from Mr. St Pierre:
13:05 #dri-devel: < Jasper> robclark, i think it's too late tbh -- any money we throw at lima
                  would be tainted i think
What Jasper probably meant is that HE is the one who is tainted, by having worked with the ARM Mali DDK directly. But his quick-fire statements at 13:01 definitely could've fooled anyone into believing something entirely different. Anyone who reads that sees "Lima is tainted because libv is a total asshole". This is how things worked during RadeonHD as well: throw some half-truths out together, and let the internet noise machine do the rest.

Here is an example of what happens then:
13:11 #dri-devel: < EdB> Jasper: what about the open source driver that try to support the next
                  generation (I don't remember the name). Is it also tainted ?
13:19 #dri-devel: < EdB> I never heard of that lima tainted things before and I was surprised
Notice that Jasper made absolutely no effort to properly clarify his statements afterwards. This is where I came in, to at least try to defuse this baseless shitthrowing.

When Jasper was confronted, his statements became even more, let's call it, varied. Suddenly, the reason for not using lima is stated here:
13:53 #dri-devel: < Jasper> Lima wasn't good enough in a performance evaluation and since it
                  showed no activity for a few years, we decided to pay for a DDK in order to 
                  ship a product instead of working on Lima instead.
To me, that sounds like a pretty complete reason as to why Endless Mobile chose not to use Lima at the time. It does not, however, exclude helping to further Lima at the same time, or later on, like now. But then the following statements came out of Jasper...
13:56 #dri-devel: < Jasper> libv, if we want to contribute to Lima, how can we do so?
13:57 #dri-devel: < Jasper> We'd obviously love an active open-source driver for Mali.
...
13:58 #dri-devel: < Jasper> I didn't see any way to contribute money through http://limadriver.org/
13:58 #dri-devel: < Jasper> Or have consulting done
Good to know... But then...
13:58 #dri-devel: < libv> Jasper: you or your company could've asked.
13:59 #dri-devel: < Jasper> My understanding was that we did.
13:59 #dri-devel: < Jasper> But that happened before I arrived.
13:59 #dri-devel: < libv> definitely not true.
13:59 #dri-devel: < Jasper> I was told we tried to reach out to the Lima project with no response.
                  I don't know how or what happened.
14:00 #dri-devel: < libv> i would remember such an extremely rare event
The supposed reaching out from EndlessM to lima seems pretty weird when taking the earlier statements into account, like the one at 13:53 and especially the one at 13:01. Why would anyone make those statements and then turn around and offer to help? What is Jasper hiding?

After that I couldn't keep my mouth shut, and I voiced my own frustration with how demotivated I am about doing the cleanup and release, and whined about some possible reasons as to why that is so. And then, suddenly...
14:52 #dri-devel: < Jasper> libv, so the reason we didn't contribute to Lima was because we didn't
                  imagine anything would come of it.
14:53 #dri-devel: < Jasper> It seems we were correct.
It seems that Jasper's memory was now refreshed, again, and that EndlessM never did reach out, for another reason altogether...

A bit further down, Jasper lets the following slip:
14:57 #dri-devel: < Jasper> libv, I know about the radeonhd story. jrb has told me quite a lot.
14:58 #dri-devel: < Jasper> Jonathan Blandford. My boss at Endless, and my former boss at Red Hat.
And suddenly the whole thing becomes clear. In the private conversation that ensued, Jasper explained that he is good friends with exactly the three persons named before, and that those three persons told him all he wanted to know about the RadeonHD project. It also seems that Jonathan Blandford was, in some way, involved with Red Hat's decision to sign a pact with the ATI devil to produce a fork of a proper open source driver for AMD. These people simply will never go near anything I do, and would rather spend their time slandering yours truly than actually contribute or provide support, and this is exactly what has been happening here too.

This brings us straight back to what was stated before anything else I quoted so far:
12:57 #dri-devel: < EdB> Jasper: I guess lima is to be avoided
...
12:58 #dri-devel: < Jasper> EdB, yeah, but that's not for legal reasons
Not for legal reasons indeed.

But the lies didn't end there. Besides coming up with more reasons not to contribute to lima, in that same private conversation Jasper "kindly" suggested that he'd continue Lima if only I threw my code over the wall. Didn't he previously state that Lima was tainted, by which I think he meant that he was tainted and couldn't work on lima?

Why?

At the time, I knew I was being played and bullshitted, but I couldn't make it stick on a single statement alone in the maelstrom that is IRC. It is only after rereading the IRC log and putting Jasper's statements next to each other that the real story becomes clear. You might feel that I am reading too much into this, and that I have deliberately taken Jasper's statements out of context to make him look bad. But I implore you to verify the above against the IRC log, and you will see that I have left very little necessary context out. The above is Jasper's statements, with my commentary, put closely together, removing the noise (which is a really bad description of the rest of the IRC discussion that went on at the time) and boosting the signal. Jasper's statements are highly erratic; they make no sense when considered collectively as truths, and some are plain lies when considered individually.

Jasper has been spending a lot of time asking for help with supporting a binary driver. Nobody in the business does that. I guess that (former) Red Hat people are special. I still have not understood why Red Hat is able to get away with so much while companies like Canonical and SuSE get trashed so easily. But that is beside the point here; no-one questioned why Jasper is wasting so much of open source developers' time on a binary driver. Jasper was indirectly asked why he was not using Lima, nothing more.

If Jasper had just stated that Endless Mobile decided not to use Lima because Endless Mobile thought Lima was not far enough along yet, and that Endless Mobile did not have the resources to do or fund all the work still necessary, then I think that would've been an acceptable answer for EdB. It's not something I like hearing, but I am quite used to this by now. Crucially though, I would not have been in a position to complain.

Jasper instead went straight to trashing me and my work, with statements that can only be described as slander. When questioned, he then gave a series of mutually exclusive statements, which can only be explained as "making things up as he goes along", working very hard to mask the original reasons for not supporting Lima at all.

This is more than just "Lima is not ready", otherwise Jasper would have been comfortable sticking to that story. Jasper, his boss and Endless Mobile did not decide against Lima on any technical or logistical basis. This whole story never was about code at all. These people simply would do anything but support something that involves me, no matter how groundbreaking, necessary or worthy my work is. They would much rather trash and re-invent, and then trash some more...

It of course did not take long for the other vultures to join the frenzy... When the discussion on IRC was long over, Daniel Stone made one attempt to revive it, to no avail. Then, a day later, someone anonymously informed phoronix, and Dave Airlie could then clearly be seen trying to stoke the fire, again, with limited success. I guess the fact that most phoronix users still only care about desktop hardware played to my advantage here (for once). It does very clearly show how these people work though, and I know that they will continue to try to play this card, trying hard to trigger another misstep from me or cause a proper shitstorm on the web.

So what now?

Now I have a choice. I can wait until I am in the mood to clean up this code and produce something useful for public consumption, and in the meantime see my name and my work slandered. Or... if I throw my (nasty) code over the wall, someone will rename/rewrite it, mess up a lot of things, and trash me for not having done corner X or Y, essentially slandering me and my work and probably even erasing my achievements from history, using more of my work against me... And in the latter case I give people like Jasper and companies like Endless Mobile what they want, in spite of their contemptible actions and statements. Logically, there is only one conclusion possible in that constellation.

At the end of the day, I am still the author of this code, and it was my time that I burned on it, my life that I wasted on it. I have no obligations to anyone, and I am free to do with my code what I want. I cannot be forced into anything. I will therefore release my work, under my own terms, when I want to and when I feel OK doing so.

This whole incident did not exactly motivate me to spend time on cleaning code up and releasing it. At best, it confirmed my views on open source software and certain parts of the X.org community.

This game never was about code or about doing The Right Thing, and never will be.
April 22, 2015

I've had a half-broken temperature monitoring setup at home for quite some time. It started out with an Atom-based NAS, a USB-serial adapter and a passive 1-wire adapter. It sometimes worked, then stopped working, then started again when poked with a stick. Later, the NAS was moved under the stairs and I put a BeagleBone Black in its old place. The temperature monitoring never really worked thereafter, but I didn't have the time to fix it. Over the last few days, I've managed to get it working again, of course by replacing nearly all the existing components.

I'm using DS18B20 sensors. They're about USD 1 apiece on eBay (when buying in small quantities) and seem to work quite well.

My first task was to address the reliability problems: dropouts and really poor performance. I suspected the passive adapter was problematic, in particular with the wire lengths I'm using, and I therefore wanted to replace it with something else. The BBB has GPIO support, and various blog posts suggested using that. However, I'm running Debian on my BBB, which doesn't have support for DTB overrides, so I needed to patch the kernel DTB. (Apparently, DTB overrides are landing upstream, but obviously not in time for Jessie.)

I've never even looked at Device Tree before, but the structure was reasonably simple, and with a sample override from bonebrews it was easy enough to come up with my patch. This uses pin 11 (yes, 11, not 13; read the bonebrews article for an explanation of the numbering) on the P8 block. This needs to be compiled into a .dtb. I found the easiest way was just to drop the patched .dts into an unpacked kernel tree and then run make dtbs.

Once this works, you need to compile the w1-gpio kernel module, since Debian hasn't yet enabled it. Run make menuconfig, find it under "Device drivers", "1-wire", "1-wire bus master", and build it as a module. I then had to build a full kernel to get the symversions right, then build the modules. I think there is, or should be, an easier way to do that, but as I cross-built it on a fast AMD64 machine, I didn't investigate too much.

Insmod-ing w1-gpio then worked, but for me, it failed to detect any sensors. Reading the data sheet, it looked like a pull-up resistor on the data line was needed. I had enabled the internal pull-up, but apparently that wasn't enough, so I added a 4.7 kOhm resistor between pin 3 (VDD_3V3) on P9 and pin 11 (GPIO_45) on P8. With that in place, my sensors showed up in /sys/bus/w1/devices, and you can read the values using cat.
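
Reading those sysfs files by hand gets old quickly, so here is a minimal Python sketch of how the values can be collected programmatically. It only assumes the standard w1_therm sysfs layout described above (DS18B20 devices show up with a 28- prefix); the file and function names are mine.

#!/usr/bin/env python
# Minimal sketch: read every DS18B20 on the 1-wire bus via sysfs.
import glob

def read_sensors():
    readings = {}
    for slave in glob.glob("/sys/bus/w1/devices/28-*/w1_slave"):
        with open(slave) as f:
            crc_line, data_line = f.read().strip().split("\n")
        if not crc_line.endswith("YES"):
            continue  # CRC check failed, discard this reading
        # The kernel reports the temperature in millidegrees, e.g. "t=23125"
        sensor_id = slave.split("/")[5]
        readings[sensor_id] = int(data_line.rsplit("t=", 1)[1]) / 1000.0
    return readings

if __name__ == "__main__":
    for sensor_id, temp in sorted(read_sensors().items()):
        print("%s: %.3f" % (sensor_id, temp))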

In my case, I wanted the data to go into collectd and then on to graphite. I first tried using an Exec plugin, but never got it to work properly. Using a Python plugin worked much better, and my graphite installation is now showing me temperatures.
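
For the curious, a collectd Python plugin for this can be quite small. The sketch below is hypothetical (the plugin name w1temp is made up, and it assumes the read_sensors() helper above is saved as w1sensors.py on the plugin's module path), but the collectd.Values/register_read API is the standard one provided by collectd's python plugin.

# w1temp.py: hypothetical collectd Python plugin dispatching 1-wire
# temperatures; collectd's write_graphite plugin forwards them to graphite.
import collectd  # provided by collectd's python plugin, not pip

from w1sensors import read_sensors


def read_w1(data=None):
    for sensor_id, temp in read_sensors().items():
        val = collectd.Values(type="temperature",
                              plugin="w1temp",
                              type_instance=sensor_id)
        val.dispatch(values=[temp])


collectd.register_read(read_w1)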

Now I just need to add more probes around the house.

The most useful references were the ones linked above, in particular the bonebrews article.

In addition, various searches for DS18B20 pinout and similar, of course.

April 21, 2015

Got a talk coming up? Want it to go well? Here are some starting points.

I give a lot of talks. Often I’m paid to give them, and I regularly get very high ratings or even awards. But every time I listen to people speaking in public for the first time, or maybe the first few times, I think of some very easy ways for them to vastly improve their talks.

Here, I wanted to share my top tips to make your life (and, selfishly, my life watching your talks) much better:

  1. Presenter mode is the greatest invention ever. Use it. If you ignore or forget everything else in this post, remember the rainbows and unicorns of presenter mode. This magical invention keeps the current slide showing on the projector while your laptop shows something different — the current slide, a small image of the next slide, and your slide notes. The last bit is the key. What I put on my notes is the main points of the current slide, followed by my transition to the next slide. Presentations look a lot more natural when you say the transition before you move to the next slide rather than after. More than anything else, presenter mode dramatically cut down on my prep time, because suddenly I no longer had to rehearse. I had seamless, invisible crib notes while I was up on stage.
  2. Plan your intro. Starting strong goes a long way, as it turns out that making a good first impression actually matters. It’s time very well spent to literally script your first few sentences. It helps you get the flow going and get comfortable, so you can really focus on what you’re saying instead of how nervous you are. Avoid jokes unless most of your friends think you’re funny almost all the time. (Hint: they don’t, and you aren’t.)
  3. No bullet points. Ever. (Unless you’re an expert, and you probably aren’t.) We’ve been trained by too many years of boring, sleep-inducing PowerPoint presentations that bullet points equal naptime. Remember presenter mode? Put the bullet points in the slide notes that only you see. If for some reason you think you’re the sole exception to this, at a minimum use visual advances/transitions. (And the only good transition is an instant appear. None of that fading crap.) That makes each point appear on-demand rather than all of them showing up at once.
  4. Avoid text-filled slides. When you put a bunch of text in slides, people inevitably read it. And they read it at a different pace than you’re reading it. Because you probably are reading it, which is incredibly boring to listen to. The two different paces mean they can’t really focus on either the words on the slide or the words coming out of your mouth, and your attendees consequently leave having learned less than either of those options alone would’ve left them with.
  5. Use lots of really large images. Each slide should be a single concept with very little text, and images are a good way to force yourself to do so. Unless there’s a very good reason, your images should be full-bleed. That means they go past the edges of the slide on all sides. My favorite place to find images is a Flickr advanced search for Creative Commons licenses. Google also has this capability within Search Tools. Sometimes images are code samples, and that’s fine as long as you remember to illustrate only one concept — highlight the important part.
  6. Look natural. Get out from behind the podium, so you don’t look like a statue or give the classic podium death-grip (one hand on each side). You’ll want to pick up a wireless slide advancer and make sure you have a wireless lavalier mic, so you can wander around the stage. Remember to work your way back regularly to check on your slide notes, unless you’re fortunate enough to have them on extra monitors around the stage. Talk to a few people in the audience beforehand, if possible, to get yourself comfortable and get a few anecdotes of why people are there and what their background is.
  7. Don’t go over time. You can go under, even a lot under, and that’s OK. One of the best talks I ever gave took 22 minutes of a 45-minute slot, and the rest filled up with Q&A. Nobody’s going to mind at all if you use up 30 minutes of that slot, but cutting into their bathroom or coffee break, on the other hand, is incredibly disrespectful to every attendee. This is what watches, and the timer in presenter mode, and clocks, are for. If you don’t have any of those, ask a friend or make a new friend in the front row.
  8. You’re the centerpiece. The slides are a prop. If people are always looking at the slides rather than you, chances are you’ve made a mistake. Remember, the focus should be on you, the speaker. If they’re only watching the slides, why didn’t you just post a link to Slideshare or Speakerdeck and call it a day?

I’ve given enough talks that I have a good feel on how long my slides will take, and I’m able to adjust on the fly. But if you aren’t sure of that, it might make sense to rehearse. I generally don’t rehearse, because after all, this is the lazy way.

If you can manage to do those 8 things, you’ve already come a long way. Good luck!


Tagged: communication, gentoo

A few months ago, I wrote a long post about what I called back then the "Gnocchi experiment". Time passed and we – me and the rest of the Gnocchi team – continued to work on that project, finalizing it.

It is with great pleasure that we are going to release our first 1.0 version this month, roughly at the same time as the integrated OpenStack projects release their Kilo milestone. The first release candidate, numbered 1.0.0rc1, was released this morning!

The problem to solve

Before I dive into Gnocchi details, it's important to have a good view of what problems Gnocchi is trying to solve.

Most IT infrastructures out there consist of a set of resources. These resources have properties: some of them are simple attributes, whereas others might be measurable quantities (also known as metrics).

And in this context, cloud infrastructures are no exception. We talk about instances, volumes, networks… which are all different kinds of resources. The problem arising with the cloud trend is storing all this data at scale and being able to query it later, for whatever use.

What Gnocchi provides is a REST API that allows the user to manipulate resources (CRUD) and their attributes, while preserving the history of those resources and their attributes.

Gnocchi is fully documented, and the documentation is available online. We are the first OpenStack project to require that patches include documentation. We want to raise the bar, so we took a stand on that. That's part of our policy, the same way it's part of the OpenStack policy to require unit tests.

I'm not going to paraphrase the whole Gnocchi documentation, which covers things like installation (super easy), but I'll guide you through some basics of the features provided by the REST API. I will show you some examples so you can have a better understanding of what you can leverage using Gnocchi!

Handling metrics

Gnocchi provides a full REST API to manipulate time-series that are called metrics. You can easily create a metric using a simple HTTP request:

POST /v1/metric HTTP/1.1
Content-Type: application/json
 
{
  "archive_policy_name": "low"
}
 
HTTP/1.1 201 Created
Location: http://localhost/v1/metric/387101dc-e4b1-4602-8f40-e7be9f0ed46a
Content-Type: application/json; charset=UTF-8
 
{
  "archive_policy": {
    "aggregation_methods": [
      "std",
      "sum",
      "mean",
      "count",
      "max",
      "median",
      "min",
      "95pct"
    ],
    "back_window": 0,
    "definition": [
      {
        "granularity": "0:00:01",
        "points": 3600,
        "timespan": "1:00:00"
      },
      {
        "granularity": "0:30:00",
        "points": 48,
        "timespan": "1 day, 0:00:00"
      }
    ],
    "name": "low"
  },
  "created_by_project_id": "e8afeeb3-4ae6-4888-96f8-2fae69d24c01",
  "created_by_user_id": "c10829c6-48e2-4d14-ac2b-bfba3b17216a",
  "id": "387101dc-e4b1-4602-8f40-e7be9f0ed46a",
  "name": null,
  "resource_id": null
}


The archive_policy_name parameter defines how the measures being sent are going to be aggregated. You can also define archive policies using the API, specifying what kind of aggregation period and granularity you want. In this case, the low archive policy keeps 1 hour of data aggregated over 1 second, and 1 day of data aggregated over 30 minutes. The functions used for aggregation are the standard mathematical functions: standard deviation, minimum, maximum… and even the 95th percentile. All of that is obviously customizable, and you can create your own archive policies.

If you don't want to specify the archive policy manually for each metric, you can also create archive policy rules, which apply a specific archive policy based on the metric name; e.g. metrics matching disk.* will be high-resolution metrics, so they will use the high archive policy.
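
As a hedged illustration, creating such a rule over the REST API could look like the sketch below, using python-requests. The endpoint and field names (metric_pattern, archive_policy_name) are my reading of the Gnocchi documentation, so double-check them against the docs for your version before relying on them.

# Sketch: an archive policy rule mapping disk.* metrics to the "high" policy.
import requests

rule = {
    "name": "disk-rule",
    "metric_pattern": "disk.*",
    "archive_policy_name": "high",
}
r = requests.post("http://localhost/v1/archive_policy_rule", json=rule)
r.raise_for_status()  # expect 201 Created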

It's also worth noting Gnocchi is precise up to the nanosecond and is not tied to the current time. You can manipulate and inject measures that are years old and precise to the nanosecond. You can also inject points with old timestamps (i.e. old compared to the most recent one in the timeseries) with an archive policy allowing it (see back_window parameter).

It's then possible to send measures to this metric:

POST /v1/metric/387101dc-e4b1-4602-8f40-e7be9f0ed46a/measures HTTP/1.1
Content-Type: application/json
 
[
  {
    "timestamp": "2014-10-06T14:33:57",
    "value": 43.1
  },
  {
    "timestamp": "2014-10-06T14:34:12",
    "value": 12
  },
  {
    "timestamp": "2014-10-06T14:34:20",
    "value": 2
  }
]

HTTP/1.1 204 No Content


These measures are synchronously aggregated and stored in the configured storage backend. Our most scalable storage drivers for now are based on either Swift or Ceph, which are both scalable object storage systems.
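
If raw HTTP is not your thing, the same workflow is easy to drive from Python with the requests library. This is just a sketch mirroring the HTTP examples above: create a metric, then push measures to it.

# Sketch: create a metric and send it measures, mirroring the HTTP above.
import requests

base = "http://localhost/v1"

metric = requests.post(base + "/metric",
                       json={"archive_policy_name": "low"}).json()

measures = [
    {"timestamp": "2014-10-06T14:33:57", "value": 43.1},
    {"timestamp": "2014-10-06T14:34:12", "value": 12},
]
r = requests.post(base + "/metric/%s/measures" % metric["id"],
                  json=measures)
assert r.status_code == 204  # measures accepted, no body returned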

It's then possible to retrieve these values:

GET /v1/metric/387101dc-e4b1-4602-8f40-e7be9f0ed46a/measures HTTP/1.1
 
HTTP/1.1 200 OK
Content-Type: application/json; charset=UTF-8
 
[
  [
    "2014-10-06T14:30:00.000000Z",
    1800.0,
    19.033333333333335
  ],
  [
    "2014-10-06T14:33:57.000000Z",
    1.0,
    43.1
  ],
  [
    "2014-10-06T14:34:12.000000Z",
    1.0,
    12.0
  ],
  [
    "2014-10-06T14:34:20.000000Z",
    1.0,
    2.0
  ]
]


As older Ceilometer users might notice here, metrics only store points and values; there's nothing fancy such as metadata anymore.

By default, the values eagerly aggregated using mean are returned for all supported granularities. You can obviously specify a time range or a different aggregation function using the aggregation, start and stop query parameters.
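
For example, here is a sketch of fetching the maximum over a given time range with python-requests, reusing the metric from the examples above:

# Sketch: retrieve measures with the aggregation/start/stop query parameters.
import requests

r = requests.get(
    "http://localhost/v1/metric/387101dc-e4b1-4602-8f40-e7be9f0ed46a/measures",
    params={"aggregation": "max",
            "start": "2014-10-06T14:30",
            "stop": "2014-10-06T15:00"})
for timestamp, granularity, value in r.json():
    print(timestamp, granularity, value)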

Gnocchi also supports doing aggregation across aggregated metrics:

GET /v1/aggregation/metric?metric=65071775-52a8-4d2e-abb3-1377c2fe5c55&metric=9ccdd0d6-f56a-4bba-93dc-154980b6e69a&start=2014-10-06T14:34&aggregation=mean HTTP/1.1
 
HTTP/1.1 200 OK
Content-Type: application/json; charset=UTF-8
 
[
  [
    "2014-10-06T14:34:12.000000Z",
    1.0,
    12.25
  ],
  [
    "2014-10-06T14:34:20.000000Z",
    1.0,
    11.6
  ]
]


This computes the mean of the means of the metrics 65071775-52a8-4d2e-abb3-1377c2fe5c55 and 9ccdd0d6-f56a-4bba-93dc-154980b6e69a, starting on 6 October 2014 at 14:34 UTC.

Indexing your resources

Another object and concept that Gnocchi provides is the ability to manipulate resources. There is a basic type of resource, called generic, which has very few attributes. You can extend this type to specialize it, and that's what Gnocchi does by default by providing resource types known to OpenStack, such as instance, volume, network or even image.

POST /v1/resource/generic HTTP/1.1
 
Content-Type: application/json
 
{
  "id": "75C44741-CC60-4033-804E-2D3098C7D2E9",
  "project_id": "BD3A1E52-1C62-44CB-BF04-660BD88CD74D",
  "user_id": "BD3A1E52-1C62-44CB-BF04-660BD88CD74D"
}
 
HTTP/1.1 201 Created
Location: http://localhost/v1/resource/generic/75c44741-cc60-4033-804e-2d3098c7d2e9
ETag: "e3acd0681d73d85bfb8d180a7ecac75fce45a0dd"
Last-Modified: Fri, 17 Apr 2015 11:18:48 GMT
Content-Type: application/json; charset=UTF-8
 
{
  "created_by_project_id": "ec181da1-25dd-4a55-aa18-109b19e7df3a",
  "created_by_user_id": "4543aa2a-6ebf-4edd-9ee0-f81abe6bb742",
  "ended_at": null,
  "id": "75c44741-cc60-4033-804e-2d3098c7d2e9",
  "metrics": {},
  "project_id": "bd3a1e52-1c62-44cb-bf04-660bd88cd74d",
  "revision_end": null,
  "revision_start": "2015-04-17T11:18:48.696288Z",
  "started_at": "2015-04-17T11:18:48.696275Z",
  "type": "generic",
  "user_id": "bd3a1e52-1c62-44cb-bf04-660bd88cd74d"
}


The resource is created with the UUID provided by the user. Gnocchi handles the history of the resource, and that's what the revision_start and revision_end fields are for. They indicate the lifetime of this revision of the resource. The ETag and Last-Modified headers are also unique to this resource revision and can be used in a subsequent request using the If-Match or If-None-Match headers, for example:

GET /v1/resource/generic/75c44741-cc60-4033-804e-2d3098c7d2e9 HTTP/1.1
If-None-Match: "e3acd0681d73d85bfb8d180a7ecac75fce45a0dd"
 
HTTP/1.1 304 Not Modified


This is useful to synchronize and update any view of the resources you might have in your application.
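
A small sketch of that synchronization pattern in Python: keep the last ETag around and only re-read the resource when the server says it changed.

# Sketch: conditional fetching of a resource using its ETag.
import requests

url = ("http://localhost/v1/resource/generic/"
       "75c44741-cc60-4033-804e-2d3098c7d2e9")

r = requests.get(url)
etag = r.headers["ETag"]
resource = r.json()

# Later, poll for changes without re-parsing an unchanged resource:
r = requests.get(url, headers={"If-None-Match": etag})
if r.status_code != 304:  # 304 Not Modified means our copy is current
    resource = r.json()
    etag = r.headers["ETag"]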

You can use the PATCH HTTP method to modify properties of the resource, which will create a new revision of the resource. The history of the resources is, obviously, available via the REST API.

The metrics property of a resource allows you to link metrics to it. You can link existing metrics or create new ones dynamically:

POST /v1/resource/generic HTTP/1.1
Content-Type: application/json
 
{
  "id": "AB68DA77-FA82-4E67-ABA9-270C5A98CBCB",
  "metrics": {
    "temperature": {
      "archive_policy_name": "low"
    }
  },
  "project_id": "BD3A1E52-1C62-44CB-BF04-660BD88CD74D",
  "user_id": "BD3A1E52-1C62-44CB-BF04-660BD88CD74D"
}
 
HTTP/1.1 201 Created
Location: http://localhost/v1/resource/generic/ab68da77-fa82-4e67-aba9-270c5a98cbcb
ETag: "9f64c8890989565514eb50c5517ff01816d12ff6"
Last-Modified: Fri, 17 Apr 2015 14:39:22 GMT
Content-Type: application/json; charset=UTF-8
 
{
  "created_by_project_id": "cfa2ebb5-bbf9-448f-8b65-2087fbecf6ad",
  "created_by_user_id": "6aadfc0a-da22-4e69-b614-4e1699d9e8eb",
  "ended_at": null,
  "id": "ab68da77-fa82-4e67-aba9-270c5a98cbcb",
  "metrics": {
    "temperature": "ad53cf29-6d23-48c5-87c1-f3bf5e8bb4a0"
  },
  "project_id": "bd3a1e52-1c62-44cb-bf04-660bd88cd74d",
  "revision_end": null,
  "revision_start": "2015-04-17T14:39:22.181615Z",
  "started_at": "2015-04-17T14:39:22.181601Z",
  "type": "generic",
  "user_id": "bd3a1e52-1c62-44cb-bf04-660bd88cd74d"
}


Haystack, needle? Find!

With such a system, it becomes very easy to index all your resources, meter them and retrieve this data. What's even more interesting is to query the system to find and list the resources you are interested in!

You can search for a resource based on any field, for example:

POST /v1/search/resource/instance HTTP/1.1
Content-Type: application/json
 
{
  "=": {
    "user_id": "bd3a1e52-1c62-44cb-bf04-660bd88cd74d"
  }
}


That query will return a list of all resources owned by the user_id bd3a1e52-1c62-44cb-bf04-660bd88cd74d.
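
Driving the search endpoint from Python follows the same pattern as the earlier requests sketches; the fields printed below (id, started_at) are taken from the resource output shown above.

# Sketch: find all instance resources owned by a given user.
import requests

query = {"=": {"user_id": "bd3a1e52-1c62-44cb-bf04-660bd88cd74d"}}
r = requests.post("http://localhost/v1/search/resource/instance", json=query)
for resource in r.json():
    print(resource["id"], resource["started_at"])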

You can do fancier queries such as retrieving all the instances started by a user this month:

POST /v1/search/resource/instance HTTP/1.1
Content-Type: application/json
Content-Length: 113
 
{
  "and": [
    {
      "=": {
        "user_id": "bd3a1e52-1c62-44cb-bf04-660bd88cd74d"
      }
    },
    {
      ">=": {
        "started_at": "2015-04-01"
      }
    }
  ]
}


And you can even do fancier queries than the fancier ones (still following?). What if we wanted to retrieve all the instances that were on host foobar on the 15th of April and that already had 30 minutes of uptime? Let's ask Gnocchi to look in the history!

POST /v1/search/resource/instance?history=true HTTP/1.1
Content-Type: application/json
Content-Length: 113
 
{
  "and": [
    {
      "=": {
        "host": "foobar"
      }
    },
    {
      ">=": {
        "lifespan": "1 hour"
      }
    },
    {
      "<=": {
        "revision_start": "2015-04-15"
      }
    }
  ]
}


I could also mention the fact that you can search for values in metrics. One feature that I will very likely include in Gnocchi 1.1 is the ability to search for resources whose specific metrics match some value, for example, the ability to search for instances whose CPU consumption was over 80% during a month.

Cherries on the cake

While Gnocchi is well integrated with and based on common OpenStack technology, please do note that it is completely able to function without any other OpenStack component and is pretty straightforward to deploy.

Gnocchi also implements a full RBAC system based on the OpenStack standard oslo.policy, which allows pretty fine-grained control of permissions.

There is also some work ongoing to have HTML rendering when browsing the API using a Web browser. While still simple, we'd like to have a minimal Web interface served on top of the API for the same price!

The Ceilometer alarm subsystem supports Gnocchi as of the Kilo release, meaning you can use it to trigger actions when a metric value crosses some threshold. And OpenStack Heat also supports auto-scaling your instances based on Ceilometer+Gnocchi alarms.

And there are a few more API calls that I didn't talk about here, so don't hesitate to take a peek at the full documentation!

Towards Gnocchi 1.1!

Gnocchi is a different beast in the OpenStack community. It is under the umbrella of the Ceilometer program, but it's one of the first projects that is not part of the (old) integrated release. Therefore we decided to have a release schedule not directly tied to OpenStack's, and we'll release more often than the rest of the old OpenStack components – probably once every two months or so.

What's coming next is a close integration with Ceilometer (e.g. moving the dispatcher code from Gnocchi to Ceilometer) and probably more features as we have more requests from our users. We are also exploring different backends such as InfluxDB (storage) or MongoDB (indexer).

Stay tuned, and happy hacking!

April 20, 2015

So I came across this very positive review of Fedora Workstation on linux.com, although it was billed as a review of GNOME 3.16. Which is of course correct, in the sense that the desktop we are going to ship in Fedora Workstation 22 is GNOME 3.16. But I wanted to take the opportunity to highlight that when you look at Fedora Workstation, it is a complete and integrated operating system. As I mentioned in a blog post about Fedora Workstation last April, the core idea for Fedora Workstation is to stop treating the operating system as a bag of parts and instead look at it as a whole, because if we are to drain the swamp here we need to solve the major issues people face with their desktop, regardless of whether a given issue is in the kernel, the graphics drivers, glibc or elsewhere in the system. We cannot just look in isolation at the small subset of packages that provides the chrome for your user interface.

This is why we are working on reliable firmware upgrades for your UEFI motherboard, by participating in the UEFI working group and adding functionality to GNOME Software to handle firmware updates.

This is why we recently joined Khronos to make sure the standards for doing 3D on Linux are good and open source friendly.

This is why we have been working so hard on improving AppData metadata coverage, well beyond the confines of ‘GNOME’ software.

This is why we have Richard Hughes and Owen Taylor working on how we can improve battery life when running Fedora or RHEL on laptops.

This is why we created dnf to replace yum, to get a fast and efficient package update system.

This is why we are working on an Adwaita theme for Qt.

And this is why we are pushing hard forward with a lot of other efforts like Wayland, libinput, Fleet Commander, Boxes and more.

So when you look at the user experience you get on Fedora Workstation, remember that it is not just a question of which version of GNOME we are shipping, but it is the fact that we are working hard on putting together a tightly vertically integrated and tested system from the kernel up to core desktop applications.

Anyone who has been using Fedora for a long while knows that this change was a major shift in philosophy and approach for the project, as Fedora, up until the introduction of the three new products in the 21 release, was very much defined by the opposite, being all about the Lego blocks. That contributed to the image of Fedora as a bleeding edge system where you should be prepared to do a lot of bleeding, and where you probably wanted to keep your toolbox with you at all times in case something broke. So I have to say that I am mightily impressed by how the Fedora community has taken to this major change, where we are now instead focusing our efforts on our three core products and putting a lot of effort into creating stuff that is polished and reliable, and which aims to be leading edge instead of bleeding edge.

So with all this in mind, I was a little disappointed when the reviewer writing the article in question ended his review by saying he was now waiting for GNOME 3.16 to appear in Ubuntu GNOME, because there is no guarantee that he would get in Ubuntu GNOME the same overall user experience we have developed for Fedora Workstation, which is the user experience his review reflects.

Anyway, I thought this could be a good opportunity to ask the wider community a question, especially if you are using GNOME on a distribution other than Fedora: what are we still missing at this point for you to consider making the switch to Fedora Workstation? I know that for some of you the answer might be as simple as ‘worn-in shoes fit best’, but anything you might have beyond that would be great to hear.
I can’t promise that we will be able to implement every suggestion you add to this blog post, but I do promise that we will review and consider every suggestion you provide and try to see how it can fit into development plans going forward.

With Linux kernel v3.20^W v4.0 already out the door, my overview of what's in 4.1 for drm/i915 is way overdue.

First, looking at the modeset side of the driver, the big overall thing is all the work to convert i915 to atomic. In this release there's code from Ander Conselvan de Oliveira to have a struct drm_atomic_state allocated for all the legacy modeset code paths in the driver. With that we can switch the internals to start using atomic state objects and gradually convert everything on the modeset side over to atomic. Matt Roper, on the other hand, was busy preparing the plane code to land the atomic watermark update code. Damien has reworked the initial plane configuration code used for fastboot, which also needs to be adapted to the atomic world.


For more specific feature work there's DRRS (dynamic refresh rate switching) from Sonika, Vandana and more people, which is now enabled where supported. The idea is to reduce the refresh rate of the panel to save power when nothing changes on the screen. And Paulo Zanoni has provided patches to improve the FBC code; hopefully we can enable that by default soon too. Under the hood, Ville has refactored the DP link rate computation and the sprite color key handling, both to prepare for future work and platform enabling. Intermediate link rate support for eDP 1.4 from Sonika builds on top of this. Imre Deak has also reworked the Baytrail/Braswell DPLL code to prepare for Broxton.

Speaking of platforms, Skylake has gained runtime PM support from Damien, and RPS (render turbo and sleep states) from Akash. Another SKL exclusive is support for scanout of Y-tiled buffers and for scanning out buffers rotated by 90°/270° (instead of just normal and rotated by 180°), from Tvrtko and Damien. Well, the rotation support didn't quite land yet, but Tvrtko's support for the special pagetable binding needed for that feature did, in the form of rotated GGTT views. Finally, Nick Hoath and Damien also submitted a lot of workaround patches for SKL.

Moving on to Braswell/Cherryview, there have been tons of fixes to the DPLL and watermark code from Vijay and Ville, and BSW has left the preliminary hardware support stage. Also for the SoC platforms, Chris Wilson has supplied a pile of patches to tune the RPS code and bring it more in line with the big core platforms.

On the GT side the big ongoing work is dynamic pagetable allocations from Michel Thierry, based upon patches from Ben Widawsky. With per-process address spaces, and even more so with the big address spaces gen8+ supports, it would be wasteful if not impossible to allocate pagetables for the entire address space upfront. But changing the code to handle another possible memory allocation failure point needed a lot of work. Most of that has landed now, but the benefits of enabling bigger address spaces haven't made it into 4.1.

Another big piece of work is XenGT client-side support from Yu Zhang and team. This is paravirtualization to allow virtual machines to tap into the render engines without requiring exclusive access, but also with a lot less overhead than full virtual hardware like vmware or virgil would provide. The host-side code has also been submitted already, but still needs a bit more work to integrate cleanly into the driver.

And of course there's been lots of other smaller work all over, as usual. Internal documentation for the shrinker, more dead UMS code removed, the vblank interrupt code cleaned up and more.
April 18, 2015

So the Endless Computer kickstarter just reached its funding goal of 100K USD. A big heartfelt congratulations to the whole team; I am looking forward to receiving my system. For everyone else out there, I strongly recommend getting in on their kickstarter: not only do you get a cool-looking computer with a really nice Linux desktop, you are also helping forward a company that has the potential to take the Linux desktop to the next level. And be aware that the computer is a standard computer (yet very cool looking) at the end of the day, so if you want to install Fedora Workstation on it, you can :)

April 17, 2015

So Red Hat is now formally a member of the Khronos Group, which many of you probably know as the shepherd of the OpenGL standard. We haven't gotten all the little bits sorted yet, like getting our logo on the Khronos website, but our engineers are signing up for the various Khronos working groups etc. as we speak.

So the reason we are joining is all the important changes that are happening in graphics and GPU compute these days, and our wish to have more direct input into the direction of some of these technologies. Our efforts are likely to focus on improving the OpenGL specification by proposing some new extensions, and of course on providing input and help with moving the new Vulkan standard forward.

So well known Red Hat engineers such as Dave Airlie, Adam Jackson, Rob Clark and others will from now on play a much more direct role in helping shape the future of 3D Graphics standards. And we really look forward to working with our friends and colleagues at Nvidia, AMD, Intel, Valve and more inside Khronos.

April 16, 2015

I am following the discussion caused by Greg Kroah-Hartman requesting that kdbus be pulled into the next kernel release. First of all, my hat off to Greg for his persistence and for staying civil. There have already been quite a few posts in the thread coming close to attempts at character assassination, and a lot of emails just adding more noise, but no signal.

One point I feel is missing from the discussion here, though, is the question of not making the perfect the enemy of the good. A lot of the posters are saying ‘hey, you should write something perfect here instead of what you have currently’. Which sounds reasonable on some level, but when you think about it, it is more a deflection strategy than a real suggestion. First of all, there is no such thing as perfect. And secondly, if there was, how long would it take to provide the theoretically perfect thing? 2 more years, 4 years, 10 years?

Speaking as someone involved in making an operating system, I can say that we would have liked to have kdbus 5 years ago, and would much prefer to get kdbus in the next few months than to get something ‘perfect’ 5 years from now.

So you can say ‘hey, you are creating a strawman here, nobody used the word "perfect" in the discussion’. Sure, nobody used that word, but a lot of messages were written about how ‘something better’ should be created instead. Yet, based on where these people seemed to be coming from, the question I ask is: better for whom? Better for the developers who are already using dbus in their applications or desktops? Better for a kernel developer who is never going to use it? Better for someone doing code review? Better for someone who doesn’t need the features of dbus, but who would want something else?

And what is ‘better’ anyway? Greg keeps calling for concrete technical feedback, but at the end of the day there are a lot of cases where the ‘best’ technical solution, to the degree you can measure that objectively, isn’t actually ‘the best’. I mean, if I came up with a new format for storing multimedia on an optical disc, one which from a technical perspective is ‘better’ than the current Blu-Ray spec, that doesn’t mean it is actually better for the general public. Getting a non-standard optical disc that will not play in your home Blu-Ray player isn’t better for 99.999% of people, regardless of the technical merit of the on-disc data format.

Something can actually be ‘better’ just because it is based on something that already exists, something a lot of people are already using, something that lets people quickly extend what they are already doing with the new functionality without needing a rewrite, and something which is available ‘now’ as opposed to at some undefined time in the future. And that is where I feel a lot of the debaters on the lkml are dropping the ball in this discussion; they just keep asking for a ‘better solution’ to the challenges of a space they often don’t have any personal development experience with, because kdbus doesn’t conform to how they would implement a kernel IPC mechanism for the kind of problems they are used to solving.

Also, there have been a lot of arguments about the ‘design’ of kdbus and dbus, mostly lacking concreteness and mostly coming from people very far removed from working on the desktop and other relevant parts of userspace. Which at the end of the day boils down to making the litmus test ‘you have to prove to me that making a better design is impossible’, and I think anyone should be able to agree that if that was the test for adding anything to the Linux kernel or elsewhere, then very little software would ever get added anywhere. In fact, if we were to hold to that kind of argumentation, we might as well leave software development behind and move to an online religion discussion forum, tossing the good ol’ ‘prove to me that God doesn’t exist’ into the ring.

So in the end I hope Linus, who is the final arbiter here, doesn’t end up making the ‘perfect’ the enemy of the good.

April 14, 2015

So George R.R. Martin has spent quite a bit of time over the last week, and even more energy, on responding to this thing called Puppygate. Which, while not directly related, has some commonalities with last year’s gamergate story.

Yet, what strikes me skimming through a plethora of online discussions is that we seem to have perfected name-calling as an argumentation technique on the Internet. And since the terminology jungle can be a bit confusing, I thought I should break it down by putting together this list of common names, so that you know which one you are ;) I’m trying for a tongue-in-cheek summary here, but if I fail, at least you know what names to call me!

  • If you don’t think the world economy is working perfectly you are a Communist
  • If you are not in favor of free immigration you are a Racist
  • If you think women have achieved any level of equality today you are a Misogynist
  • If you are critical of any part of the Christian faith you are a Christian hater or taking part in a War on Christmas
  • If you think anything is well in the world today you are a Straight White Male
  • If you are against any Israeli policies you’re an Antisemite or Neo-nazi
  • If you feel that gays, women or ethnic minorities are not having their voice heard on an equal level you are a Social Justice Warrior
  • If you are critical of any part of the Quran or Islamic practice you are an Islamophobe

I am sure there are lots more; these are just a collection of some I’ve seen recently. So while I think there are real issues behind many of these examples, sadly the Internet seems best at turning anything into a binary question, which surprisingly turns out not to be a super method for finding common ground. And as it turns out, if you can attach a name to these things, you can turn them into binary questions even faster!

Not that these things are specifically tied to the Internet; it’s been a long time since I last saw any kind of political debate where there seemed to be any time or interest in bringing out the nuances of the discussion. We do live in the age of bumper-sticker politics, after all.

Anyway, enough musing about the sad state of public discourse :)

April 07, 2015

On Intel's Gen graphics, three-source instructions like MAD and LRP cannot have constants as arguments. When support for MAD instructions was introduced with Sandybridge, we assumed the choice between a MOV+MAD and a MUL+ADD sequence was inconsequential, so we chose to perform the multiply and add operations separately. Revisiting that assumption has uncovered some interesting things about the hardware and has led us to some pretty nice performance improvements.

On Gen 7 hardware (Ivybridge, Haswell, Baytrail), multiplies and adds without immediate value arguments can be co-issued, meaning that multiple instructions can be issued from the same execution unit in the same cycle. MADs, never having immediates as sources, can always be co-issued. Considering that, we should prefer MADs, but a typical vec4 * vec4 + vec4(constant) pattern would lead to three duplicate (four total) MOV imm instructions.

mov(8)  g10<1>F    1.0F
mov(8)  g11<1>F    1.0F
mov(8)  g12<1>F    1.0F
mov(8)  g13<1>F    1.0F
mad(8)  g40<1>F    g10<8,8,1>F   g20<8,8,1>F   g30<8,8,1>F
mad(8)  g41<1>F    g11<8,8,1>F   g21<8,8,1>F   g31<8,8,1>F
mad(8)  g42<1>F    g12<8,8,1>F   g22<8,8,1>F   g32<8,8,1>F
mad(8)  g43<1>F    g13<8,8,1>F   g23<8,8,1>F   g33<8,8,1>F

Should be easy to clean up, right? We should simply combine those 1.0F MOVs and modify the MAD instructions to access the same register. Well, conceptually yes, but in practice not quite.

Since the i965 driver's fragment shader backend doesn't use static single assignment form (it's on our TODO list), our common subexpression elimination pass has to emit a MOV instruction when combining instructions. As a result, performing common subexpression elimination on immediate MOVs would undo constant propagation and the compiler's optimizer would go into an infinite loop. Not what you wanted.

Instead, I wrote a pass that scans the instruction list after the main optimization loop and creates a list of immediate values that are used. If an immediate value is used by a 3-source instruction (a MAD or a LRP) or at least four times by an instruction that can co-issue (ADD, MUL, CMP, MOV) then it's put into a register and sourced from there.
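As a rough illustration of that decision (the names and structure here are hypothetical, not Mesa's actual code), the promotion check per immediate boils down to something like:

/* Hypothetical sketch of the promotion decision described above. */
struct imm_use {
   float    val;              /* the immediate value */
   unsigned uses_by_3src;     /* uses by MAD/LRP, which can't take immediates */
   unsigned uses_by_coissue;  /* uses by co-issueable ADD/MUL/CMP/MOV */
};

static bool
should_promote_to_register(const struct imm_use *imm)
{
   /* Promote if a 3-source instruction needs the value, or if enough
    * co-issueable instructions would benefit from losing the immediate. */
   return imm->uses_by_3src > 0 || imm->uses_by_coissue >= 4;
}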

But there's still room for improvement. Each general register can store 8 floats, and instead of storing 8 separate constants in each, we're storing a single constant 8 times (and on SIMD16, 16 times!). Fixing that wasn't hard, and it significantly reduces register usage - we now only use one register for each 8 immediate values. Using a special vector-float immediate type we can even load four floating-point values in a single instruction.

With that in place, we can now always emit MAD instructions.

I'm pretty pleased with the results. Without using the New Intermediate Representation (NIR), the shader-db results are:

total instructions in shared programs: 5895414 -> 5747578 (-2.51%)
instructions in affected programs:     3618111 -> 3470275 (-4.09%)

And with NIR (which already unconditionally emits MAD instructions):

total instructions in shared programs: 7992936 -> 7772474 (-2.76%)
instructions in affected programs:     3738730 -> 3518268 (-5.90%)

Effects on a WebGL microbenchmark

In December, I checked what effect my constant combining pass would have on a WebGL procedural noise demo. The demo generates an effect ("noise") that looks like a ball of fire. Its fragment shader contains a ton of instructions but no texturing operations. We're currently able to compile the program in SIMD8 without spilling any registers, but at a cost of scheduling the instructions very badly.

Noise Demo in action

The effects the constant combining pass has on this demo are really interesting, and it actually gives me evidence that some of the ideas I had for the pass are valid, namely that co-issuing instructions is worth a little extra register pressure.

  1. 1.00x FPS of baseline - 3123 instructions - baseline
  2. 1.09x FPS of baseline - 2841 instructions - after promoting constants only if used by more than 2 MADs

Going from no-constant-combining to restricted-constant-combining gives us a 9% increase in frames per second for a 9% instruction count reduction. We're totally limited by fragment shader performance.

  3. 1.46x FPS of baseline - 2841 instructions - after promoting any constant used by a MAD

Going from step 2 to 3 though is interesting. The instruction count doesn't change, but we reduced register pressure sufficiently that we can now schedule instructions better without spilling (SCHEDULE_PRE, instead of SCHEDULE_PRE_NON_LIFO) - a 33% speed up just by rearranging instructions.

  4. 1.62x FPS of baseline - 2852 instructions - after promoting constants used by at least 4 co-issueable instructions

I was worried that we weren't going to be able to measure any performance difference from pulling constants out of co-issueable instructions, but we definitely get a nice improvement here: about a 10% increase in frames per second.

As an aside, I did an experiment to see what would happen if we used SCHEDULE_PRE and spilled registers anyway (I added a couple of extra instructions to increase register pressure over the threshold). I changed the window size to 2048x2048 and rendered a fixed number of frames.

  • SCHEDULE_PRE with no spills: 17.5 seconds
  • SCHEDULE_PRE with 4 spills (8 send instructions): 17.5 seconds
  • SCHEDULE_PRE_NON_LIFO with no spills: 28 seconds

So there's some good evidence that the cure is worse than the disease. Of course this demo doesn't do any texturing, so memory bandwidth is not at a premium.

  5. 1.76x FPS of baseline - 2609 instructions - ???

I ran the demo to see if we'd made any changes in the last two months and was pleasantly surprised to find that we'd cut another 9% of instructions. I have no idea what caused it, but I'll take it! Combined with everything else, we're up to a 76% performance improvement.

Where's the code

The Mesa patches that implement the constant combining pass were committed (commit bb33a31c) and will be in the next major release (presumably version 10.6).

If any of this sounds interesting enough that you'd like to do it for a living, feel free to contact me. My team at Intel is responsible for the open source 3D driver in Mesa and is looking for new talent.

April 02, 2015
Presentation and conferencing

Last weekend, in the Salle des Rancy in Lyon, GNOME folks (Fred Peters, Mathieu Bridon and myself) set up our booth at the top of the stairs, the space graciously offered by Ubuntu-FR and Fedora being a tad small. The JdLL were starting.

We gave away a few GNOME 3.14 Live and install DVDs (more on that later), and discussed much-loved features, hated bugs, and how to report them. A very pleasant experience all in all.



On Sunday afternoon, I did a small presentation about GNOME's 15 years: talking about the upheaval, about dragging kernel drivers and OS components kicking and screaming to work as their APIs say they should, presenting GNOME 3.16's new features and teasing upcoming GNOME 3.18 ones.

During the Q&A, we had a few folks more than interested in support for tablets and convertible devices (such as the Microsoft Surface, and Asus T100). Hopefully, we'll be able to make the OS support good enough for people to be able to use any Linux distribution on those.

Sideshow with the Events box

Due to scheduling errors on my part, we ended up with the "v1" events box for our booth. I made a few changes to the box before we used it:

  • Removed the 17" screen, and replaced it with a 21" widescreen one with built-in speakers. This is useful when we can't set up the projector because of a lack of walls.
  • Upgraded the machine to 1GB of RAM, thanks to my hoarding of old parts.
  • Bought a French keyboard and removed the German one (with missing keys), cleaned up the UK one (which still uses IR wireless).
  • Threw away GNOME 3.0 CDs (but kept the sleeves that don't mention the minor version). You'll need to take a sharpie to the small print on the back of the sleeve if you don't fill it with an OpenSUSE CD (we used Fedora 21 DVDs during this event).
  • Triaged the batteries. Office managers, get this cheap tester!
  • The machine's Wi-Fi was unstable, causing hardlocks (please test again if you use a newer version of the kernel/distributions). We tried to get onto the conference network through the wireless router, and installed DD-WRT on it as the vendor firmware didn't allow that.
  • The Nokia N810 and N800 tablets will be going to kernel developers who are working on Nokia's old Linux devices and upstreaming drivers.
The events box is still in Lyon, until I receive some replacement hardware.

The machine is 7 years old (nearly 8!) and only had 512MB of RAM. After the 1GB upgrade, the machine was usable, and many people were impressed by the speed of GNOME on a legacy machine like that (probably more so than by a brand new one stuttering because of a driver bug, for example).

This makes you wonder what the use for "lightweight" desktop environments is, when a lot of the features are either punted to helpers that GNOME doesn't need or not implemented at all (an old CPU and no 3D driver is pretty much the only use case for those).

I'll be putting a small SSD into the demo machine, to give it another speed boost. We'll also be needing a new padlock, after an emergency metal-saw attack was necessary on Sunday morning. Five different folks tried to open the lock with the code read off my email, to no avail. Did we accidentally change the combination? We'll never know.

New project, ish

For demo machines, especially newly installed ones, you'll need some content to demo applications. This is my first attempt at uniting GNOME's demo content for release-notes screenshots with some additional content that's free to redistribute. The repository will eventually move to gnome.org, obviously.

Thanks

The new keyboard and mouse, monitor, padlock, and SSD (and my time) were graciously sponsored by Red Hat.
[This is a cross-post from the mail I just sent out to intel-gfx.]

Codes of conduct seem to be in the news a bit recently, and I realized that I've never really documented how we run things. It's different from the kernel's overall CodeOfConflict and also differs from the official Intel/OTC one in small details about handling issues. And for completeness there's also the Xorg Foundation event policy. Anyway, I think this is worth clarifying, so here it goes.

It's simple: Be respectful, open and excellent to each other.

Which doesn't mean we want to sacrifice quality to be nice. Striving for technical excellence very much doesn't exclude being excellent to someone else, and in our experience it tends to go hand in hand.

Unfortunately things go south occasionally. So if you feel threatened, personally abused or otherwise uncomfortable, even and especially when you didn't participate in a discussion yourself, then please raise this in private with the drm/i915 maintainers (currently Daniel Vetter and Jani Nikula, see MAINTAINERS for contact information). And the "in private" part is important: Humans screw up, and disciplining minor fumbles by tarnishing someone's google-able track record forever is out of proportion.

Still there are some teeth to this code of conduct:

1. First time around minor issues will be raised in private.

2. On repeat cases a public reply in the discussion will enforce that respectful behavior is expected.

3. We'll ban people who don't get it.

And severe cases will be escalated much quicker.

This applies to all community communication channels (irc, mailing list and bugzilla). And as mentioned this really just is a public clarification of the rules already in place - you can't see that though since we never had to go further than step 1.

Let's keep it at that.

And in case you have a problem with an individual drm/i915 maintainer and don't want to raise it with the other one, there's the Xorg BoD, the Linux Foundation TAB and the drm upstream maintainer Dave Airlie.
March 30, 2015

And once again, it’s time for another Limba blogpost :-)

Limba is a solution to install 3rd-party software on Linux, without interfering with the distribution’s native package manager. It can be useful to try out different software versions, use newer software on a stable OS release or simply to obtain software which does not yet exist for your distribution.

Limba works distribution-independent, so software authors only need to publish their software once for all Linux distributions.

I recently released version 0.4, with which the most important features you would expect from a software manager are complete. This includes installing & removing packages, GPG-signing of packages, package repositories, package updates, etc. Using Limba is still a bit rough, but most things work pretty well already.

So, it’s time for another progress report. Since a FAQ-like list is easier to digest than a long blogpost, I’ll go with that format again. Let’s address one important general question first:

How does Limba relate to the GNOME Sandboxing approach?

(If you don’t know about GNOME’s sandboxes, take a look at the GNOME Wiki – Alexander Larsson also blogged about it recently)

First of all: There is no rivalry here and no NIH syndrome involved. Limba and GNOME’s sandboxes (XdgApp) are different concepts, which both have their place.

The main difference between the two projects is the handling of runtimes. A runtime is the set of shared libraries and other shared resources applications use. This includes libraries like GTK+/Qt5/SDL/libpulse etc. XdgApp applications have one big runtime they can use, built with OSTree. This runtime is static and will not change; it will only receive critical security updates. A runtime in XdgApp is provided by a vendor like GNOME as a compilation of multiple individual libraries.

Limba, on the other hand, generates runtimes on the target system on-the-fly out of several subcomponents with dependency-relations between them. Each component can be updated independently, as long as the dependencies are satisfied. The individual components are intended to be provided by the respective upstream projects.

Both projects have their individual up- and downsides: While the static runtime of XdgApp makes testing simple, it is also harder to extend and more difficult to update. If something you need is not provided by the mega-runtime, you will have to provide it yourself (e.g. we will have some applications ship smaller shared libraries with their binaries, as they are not part of the big runtime).

Limba does not have this issue but instead, with its dynamic runtimes, relies on upstreams behaving nicely and not breaking ABIs in security updates, so existing applications continue to work even with newer software components.

Obviously, I like the Limba approach more, since it is incredibly flexible and even allows mimicking the behaviour of GNOME’s XdgApp by using absolute dependencies on components.

Do you have an example of a Limba-distributed application?

Yes! I recently created a set of packages for Neverball – Alexander Larsson also created an XdgApp bundle for it, and due to the low number of things Neverball depends on, it was a perfect test subject.

One of the main things I want to achieve with Limba is to integrate it well with continuous integration systems, so you can automatically get a Limba package built for your application and have it tested with the current set of dependencies. Also, building packages should be very easy, and as failsafe as possible.

You can find the current Neverball test in the Limba-Neverball repository on Github. All you need (after installing Limba and the build dependencies of all components) is to run the make_all.sh script.

Later, I also want to provide helper tools to automatically build the software in a chroot environment, and to allow building against the exact version depended on in the Limba package.

Creating a Limba package is trivial: it boils down to writing a simple “control” file describing the dependencies of the package, and writing an AppStream metadata file. If you feel adventurous, you can also add automatic build instructions as a YAML file (which uses a subset of the Travis build config schema).
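Purely for illustration – the field names below are invented and not Limba’s actual schema, which the Limba documentation describes – the kind of information such a control file carries is roughly:

# hypothetical sketch, not Limba's real control-file syntax
Format-Version: 1.0
Requires: SDL2 (>= 2.0), libpng (>= 1.6)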

This is the Neverball Limba package, built on Tanglu 3, run on Fedora 21:

Limba-installed Neverball

Which kernel do I need to run Limba?

The Limba build tools run on any Linux version, but to run applications installed with Limba, you need at least Linux 3.18 (for Limba 0.4.2). I plan to bump the minimum version requirement to Linux 4.0+ very soon, since this release contains some improvements in OverlayFS and a few other kernel features I am thinking about making use of.

Linux 3.18 is included in most Linux distributions released in 2015 (and of course any rolling release distribution and Fedora have it).

Building all these little Limba packages and keeping them up-to-date is annoying…

Yes indeed. I expect that we will see some “bigger” Limba packages bundling a few dependencies, but in general this is a pretty annoying property of Limba currently, since there are so few packages available for reuse. But I plan to address this: behind the scenes, I am working on a webservice which will allow developers to upload Limba packages.

This central resource can then be used by other developers to obtain dependencies. We can also perform some QA on the received packages, match the available software against CVE databases to see if a component is vulnerable and publish that information, etc.

All of this is currently planned, and I can’t say a lot more yet. Stay tuned! (As always: If you want to help, please contact me)

Are the Limba interfaces stable? Can I use it already?

The Limba package format should be stable by now. Since Limba is still alpha software, I will, however, make breaking changes in case there is a huge flaw which makes it reasonable to break the IPK package format. I don’t think that this will happen, though, as the Limba packages are designed to be easily backward- and forward-compatible.

For the Limba repository format, I might make some more changes (less invasive, but you might need to rebuild the repository).

tl;dr: Yes! Please use Limba and report bugs, but keep in mind that Limba is still in an early stage of development, and we need bug reports!

Will there be integration into GNOME-Software and Muon?

From the GNOME-Software side, there were positive signals about that, but some technical obstacles need to be resolved first. I did not yet get in contact with the Muon crew – they are just implementing AppStream, which is a prerequisite for having any support for Limba[1].

Since PackageKit dropped support for plugins, every software manager needs to implement Limba support itself.


So, thanks for reading this (again too long) blogpost :) There are some more exciting things coming soon, especially regarding AppStream on Debian/Ubuntu!

 

[1]: And I should actually help with the AppStream support, but currently I cannot allocate enough time to take on that additional project as well – this might change in a few weeks. Also, Muon is doing pretty well already!

March 25, 2015
Did you see?

It will obviously be in Fedora 22 Beta very shortly.

What happened since 3.14? Quite a bit, and a number of unfinished projects will hopefully come to fruition in the coming months.

Hardware support

After quite a bit of back and forth, automatic rotation for tablets will not be included directly in systemd/udev, but instead in a separate D-Bus daemon. The daemon has support for other sensor types, Ambient Light Sensors (ColorHug ALS amongst others) being the first ones. I hope we have compass support soon too.

Support for the Onda v975w's touchscreen and accelerometer is now upstream. Work is ongoing for the Wi-Fi driver.

I've started some work on supporting the much hated Adaptive keyboard on the X1 Carbon 2nd generation.

Technical debt

In the last cycle, I've worked on triaging gnome-screensaver, gnome-shell and gdk-pixbuf bugs.

The first got merged into the second, and the second got plenty of outdated bugs closed and priorities re-evaluated as a result.

I wrangled old patches and cleaned up gdk-pixbuf. We still have architectural problems in the library for huge images, but at least we're at a point where we know what the problems are, rather than being buried in Bugzilla.

Foundation building

A couple of projects got started that haven't reached maturity yet. I'm pretty happy that we're able to use gnome-books (part of gnome-documents) today to read comic books. ePub support is coming!



Grilo saw plenty of activity. The oft-requested "properties" page in Totem is closer than ever, and so is series grouping.

In December, Allan and I met with the ABRT team, and we've landed some changes we discussed there, including a simple "Report bugs" toggle in the Privacy settings, with a link to the OS' privacy policy. The gnome-abrt application had a facelift, but we got somewhat stuck on technical problems, which should get solved in the next cycle. The notifications were also streamlined and simplified.



I'm a fan

Of the new overlay scrollbars, and the new gnome-shell notification handling. And I'm cheering on the new app in 3.16, GNOME Calendar.

There's plenty more new and interesting stuff in the release, but I would just be duplicating much of the GNOME 3.16 release notes.
March 20, 2015


Next weekend, I will give a small presentation on GNOME's fifteen years at the JdLL.

If the delivery gods are kind, GNOME should also have a presence in the community village.
Now that Presentation feedback has finally landed in Weston (feedback, flags), people are starting to pay attention to output timings, as you can now measure them better. I have seen a couple of complaints already that Weston has an extra frame of latency, and this is true. I also have a patch series to fix it that I am going to propose.

To explain how the patch series affects Weston's repaint loop, I made some JSON-timeline recordings before and after, and produced some graphs with Wesgr. Here I will explain how the repaint loop works timing-wise.

Original post Feb 11, 2015.
Update Mar 20, 2015: the patches have landed in Weston.


The old algorithm

The old repaint scheduling algorithm in Weston repaints immediately on receiving the pageflip completion event. This maximizes the time available for the compositor itself to repaint, but it also means that clients can never hit the very next vblank / pageflip.

Figure 1. The old algorithm, the client paints as response to frame callbacks.

Frame callback events are sent at the "post repaint" step. This gives clients almost a full frame's time to draw and send their content before the compositor goes to "begin repaint" again. In Figure 1 you can see that if a client paints extremely fast, the latency to screen is almost two frame periods. The frame latency can never be less than one frame period, because the compositor samples the surface contents (the "repaint flush" point) immediately after the previous vblank.

Figure 2. The old algorithm, the client paints as response to Presentation feedback events.

While frame-callback-driven clients still get the full frame rate, the situation is worse if the client's painting is driven by presentation_feedback.presented events. The intent is to draw and show a new frame as soon as the old frame was shown. Because Weston starts its repaint immediately on pageflip completion, which is essentially the same time Presentation feedback is sent, the client cannot hit the repaint of this frame and gets postponed to the next. This is the same two-frame latency as with frame callbacks, but here the framerate is halved, because the client waits for the frame to actually be shown before continuing, as is evident in Figure 2.

Figure 3. The old algorithm, client posts a frame while the compositor is idle.

Figure 3. shows a less relevant case, where the compositor is idle while a client posts a new frame ("damage commit"). When the compositor is idle graphics-wise (the gray background in the figure), it is not repainting continuously according to the output scanout cycle. To start painting again, Weston waits for an extra vblank first, then repaints, and then the new frame is shown on the next vblank. This is also a 1-2 frame period latency, but it is unrelated to the other two cases, and is not changed by the patches.

The modification to the algorithm

The modification is simple, yet perhaps counter-intuitive at first. We reduce the latency by adding a delay. The "delay before repaint" is in all the figures, and the old algorithm is essentially using a zero delay. The compositor's repaint is delayed so that clients have a chance to post a new frame before the compositor samples the surface contents.

A good amount of delay is a hard question. Too small a delay and clients do not have time to act. Too long a delay and the compositor itself is in danger of missing the vblank deadline. I do not know what a good amount is or how to derive it, so I just made it configurable: you can set the repaint window length in milliseconds in weston.ini. The repaint window is the time from starting repaint to the deadline, so the delay is the frame period minus the repaint window. If the repaint window is too long for a frame period, the algorithm reduces to the old behaviour.
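As a concrete example: at 60 Hz the frame period is about 16.7 ms, so a 7 ms repaint window leaves a delay of roughly 16.7 − 7 ≈ 9.7 ms before repaint. Assuming the option is exposed under the [core] section with the name sketched here (the name is my assumption; the weston.ini man page is authoritative), the setting would look like:

# sketch of the weston.ini setting; option name assumed
[core]
repaint-window=7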

The new algorithm

The following figures are made with a 60 Hz refresh and a 7 millisecond repaint window.

Figure 4. The new algorithm, the client paints as response to frame callback.

When a client paints as response to the frame callback (Figure 4), it still has a whole frame period of time to paint and post the frame. The total latency to screen is a little shorter now, by the length of the delay before compositor's repaint. It is a slight improvement.

Figure 5. The new algorithm, the client paints as response to Presentation feedback.

A significant improvement can be seen in Figure 5. A client that uses the Presentation extension to wait for a frame to be actually shown before painting again is now able to reach the full output frame rate. It just needs to paint and post a new frame during the delay before compositor's repaint. This mode of operation provides the shortest possible latency to screen as the client is able to target the very next vblank. The latency is below one frame period if the deadlines are met.
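To make that concrete, here is a minimal C sketch of such a client repaint loop, written against the presentation-time protocol as it later stabilized (wp_presentation_feedback); redraw() is a hypothetical caller-side helper that paints, attaches, commits and requests new feedback for the committed frame:

/* Sketch: drive the repaint loop from Presentation feedback.
 * Assumes the presentation-time protocol header generated by
 * wayland-scanner; redraw() is a hypothetical helper. */
static void redraw(void *data);

static void
feedback_sync_output(void *data, struct wp_presentation_feedback *fb,
                     struct wl_output *output)
{
    /* not needed for this example */
}

static void
feedback_presented(void *data, struct wp_presentation_feedback *fb,
                   uint32_t tv_sec_hi, uint32_t tv_sec_lo, uint32_t tv_nsec,
                   uint32_t refresh_ns, uint32_t seq_hi, uint32_t seq_lo,
                   uint32_t flags)
{
    wp_presentation_feedback_destroy(fb);
    redraw(data); /* paint now: lands inside the delay-before-repaint window */
}

static void
feedback_discarded(void *data, struct wp_presentation_feedback *fb)
{
    wp_presentation_feedback_destroy(fb);
    redraw(data); /* frame was never shown; just continue */
}

static const struct wp_presentation_feedback_listener feedback_listener = {
    feedback_sync_output,
    feedback_presented,
    feedback_discarded,
};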

Discussion

This is a relatively simple change that should reduce display latency, but analyzing how exactly it affects things is not trivial. That is why Wesgr was born.

This change does not really allow clients to wait some additional time before painting to reduce the latency even more, because nothing tells clients when the compositor will repaint exactly. The risk of missing an unknown deadline grows the later a client paints. Would knowing the deadline have practical applications? I'm not sure.

These figures also show the difference between the frame callback and Presentation feedback. When a client's repaint loop is driven by frame callbacks, it maximizes the time available for repainting, which reduces the possibility to miss the deadline. If a client drives its repaint loop by Presentation feedback events, it minimizes the display latency at the cost of increased risk of missing the deadline.

All the above ignores a few things. First, we assume that the time of display is the vblank that starts to scan out the new frame. Scanning out a frame actually takes most of the frame period; it's not instantaneous. Going deeper, updating the framebuffer during the scanout period instead of during vblank could reduce latency even more, but the matter becomes complicated and even somewhat subjective. I hear some people prefer tearing to reduce the latency further. Second, we also ignore any fencing issues that might come up in practice. If a client submits a GPU job that takes a long while, there is a good chance it will cause everything to miss a deadline or more.

As usual, this work and most of the development of JSON-timeline and Wesgr were sponsored by Collabora.

PS. Latency and timing issues are nothing new. Owen Taylor has several excellent posts on related subjects in his blog.
March 19, 2015

[Update 19/03/15]: TLDR: kernel patches queued for 4.0 do not require any userspace changes.

Lenovo released a new set of laptops for 2015 with a new (old) feature: the trackpoint device has the physical buttons back. Last year's experiment apparently didn't work out so well.

What do we do in Linux with the last generation's touchpads? The kernel marks them with INPUT_PROP_TOPBUTTONPAD based on the PNPID [1]. In the X.Org synaptics driver and libinput we take that property and emulate top software buttons based on it. That took us a while to get sorted and feed into the myriad of Linux distributions out there but at some point last year the delicate balance of nature was restored and the touchpad-related rage dropped to the usual background noise.
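For reference, userspace can query that device property on an evdev node like this (a minimal sketch, error handling mostly omitted):

/* Sketch: check an evdev fd for INPUT_PROP_TOPBUTTONPAD. */
#include <linux/input.h>
#include <sys/ioctl.h>
#include <string.h>

int has_topbuttonpad(int fd)
{
    unsigned char props[INPUT_PROP_MAX / 8 + 1];

    memset(props, 0, sizeof(props));
    if (ioctl(fd, EVIOCGPROP(sizeof(props)), props) < 0)
        return 0;
    /* test the property bit the kernel set based on the PNPID */
    return !!(props[INPUT_PROP_TOPBUTTONPAD / 8] &
              (1 << (INPUT_PROP_TOPBUTTONPAD % 8)));
}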

Slow-forward to 2015 and Lenovo introduces the new series. In the absence of unnecessary creativity they are called the X1 Carbon 3rd, T450, T550, X250, W550, L450, etc. Lenovo did away with the un(der)-appreciated top software buttons and re-introduced physical buttons for the trackpoint. Now, that's bound to make everyone happy again. However, as we learned from Agent Smith, happiness is not the default state of humans so Lenovo made sure the harvest is safe.

What we expected to happen was that the trackpoint device has BTN_LEFT, BTN_MIDDLE, BTN_RIGHT and the touchpad has BTN_LEFT and is marked with INPUT_PROP_BUTTONPAD (i.e. it is a Clickpad). That is the case on the x220 generation and the T440 generation. Though the latter doesn't actually have trackpoint buttons and we emulated them in software.

On the X1 Carbon 3rd, the trackpoint has BTN_LEFT, BTN_MIDDLE, BTN_RIGHT, but they never send events. The touchpad has BTN_LEFT and BTN_0, BTN_1 and BTN_2 [2]. Clicking the left button on the trackpoint generates BTN_0 on the touchpad device, clicking the right button generates BTN_1 on the touchpad device. So in short, Lenovo has decided to wire the newly re-introduced trackpoint buttons to the touchpad, not the trackpoint. [3] The middle button is currently dead, which is a kernel bug. Meanwhile we think of it as a security feature - never accidentally paste your password into your IRC session again!

What does this mean for us? Neither synaptics nor evdev nor libinput currently support this so we've been busy apidae and writing patches like crazy. [Update 19/03/15]: The patches queued in Dmitry's for-linus branch re-route the trackstick buttons in the kernel through the trackstick device. To userspace, the trackstick thus looks the same as in the past and no userspace patches are needed. I've reverted the synaptics patch already, the libinput patch will follow. The patch goes into the kernel and udev.... The two patches needed go into the kernel and udev, and libinput. No, the three patches needed go into the kernel, udev and libinput, and synaptics. The four patches, no, wait. Amongst the projects needing patches are the kernel, udev, libinput and synaptics. I'll try again:

With those put together, things pretty much work as they're supposed to. libinput handles middle-button scrolling as well this way, but synaptics won't, for much the same reason it didn't work in the previous generation: synaptics can't talk to evdev and vice versa. And given that synaptics is on life support or in palliative care, depending on how you look at it, I recommend not holding your breath for a fix. Otherwise you may join it quickly.

Note that all the patches are fresh off the presses and there may be a few bits changing before they are done. If you absolutely can't live without the trackpoint's buttons you can work around it by disabling the synaptics kernel driver until the patches have trickled down to your distribution.

The tracking bug for all this is Bug 88609. Feel free to CC yourself on it. Bring popcorn.

Final note: I haven't seen logs from the T450, T550, ... devices yet, so this is only confirmed on the X1 Carbon so far. Given the hardware is essentially identical I expect it to be true for the rest of the series though.

[1] We also apply quirks for the 2013 generation because the firmware was buggy - a problem Synaptics Inc. has since fixed (but currently gives us slight headaches).
[2] It is also marked with INPUT_PROP_TOPBUTTONPAD, which is a bug. It uses a new PNPID, but one that was in the range we previously believed was for pads without trackpoint buttons. That's an easy thing to fix.
[3] The reason for that seems to be HW design: this way they can keep the same case/keyboard and just swap the touchpad bits.
[4] synaptics is old enough to support dedicated scroll buttons. Buttons that used to send BTN_0 and BTN_1 and are thus interpreted as scroll up/down event.

March 16, 2015
So I've still been working on the virgil3d project, along with part-time help from Marc-Andre and Gerd at Red Hat, and we've been making steady progress. This post is about a test harness I just finished developing for adding and debugging GL features.

So one of the more annoying issues with working on virgil has been that while working on adding 3D renderer features or trying to track down a piglit failure, you generally have to run a full VM to do so. This adds a long round trip to your test/development cycle.

I'd always had the idea to do some sort of local system renderer, but there are some issues with calling GL from inside a GL driver. So my plan was to have a renderer process which loads the renderer library that qemu loads, and a mesa driver that hooks into the software rasterizer interfaces. So instead of running llvmpipe or softpipe I have a virpipe gallium wrapper, that wraps my virgl driver and the sw state tracker via a new vtest winsys layer for virgl.

So the virgl pipe driver sits on top of the new winsys layer, and the new winsys instead of using the Linux kernel DRM apis just passes the commands over a UNIX socket to a remote server process.

The remote server process then uses EGL and the renderer library; it forks a new copy for each incoming connection, which dies off when the rendering is done.
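The fork-per-connection part is the classic UNIX socket server pattern; here is a generic sketch of it (not the actual vtest code - handle_connection() is a hypothetical stand-in for initializing EGL plus the renderer library and processing the command stream):

/* Generic fork-per-connection server loop, as described above. */
#include <sys/socket.h>
#include <sys/wait.h>
#include <unistd.h>

static void handle_connection(int fd); /* hypothetical renderer work */

static void serve(int listen_fd)
{
    for (;;) {
        int fd = accept(listen_fd, NULL, NULL);
        if (fd < 0)
            continue;
        if (fork() == 0) {
            close(listen_fd);      /* child only talks to its client */
            handle_connection(fd); /* render until the client is done */
            _exit(0);              /* then die off */
        }
        close(fd);                 /* parent keeps listening */
        while (waitpid(-1, NULL, WNOHANG) > 0)
            ;                      /* reap finished children */
    }
}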

The final rendered result has to be read back over the socket, and then the sw winsys is used to putimage the rendering onto the screen.

So this system is probably going to be slower in raw speed terms, but for developing features or debugging failures it should provide an easier route without the overheads of the qemu process. I was pleasantly surprised it only took two days to pull most of this test harness together, which was neat; I'd planned much longer for it!

The code lives in two halves.
http://cgit.freedesktop.org/~airlied/virglrenderer
http://cgit.freedesktop.org/~airlied/mesa virgl-mesa-driver

[updated: pushed into the main branches]

Also, the virglrenderer repo is standalone now. It has a bunch of unit tests in it that are run under valgrind as well, in an attempt to lock down some more corners of the API and test for possible ways to escape the host.
March 06, 2015

As I said in a previous post, this year I attended FOSDEM to give a talk about how to test our OpenGL drivers with Free Software. The slides are available online and the video recording has recently been released.

Last but not least, I would like to thank Igalia for sponsoring my trip :-)

Igalia logo

libinput supports edge scrolling since version 0.7.0. Whoops, how does the post title go with this statement? Well, libinput supports edge scrolling, but only on some devices and chances are your touchpad won't be one of them. Bug 89381 is the reference bug here.

First, what is edge scrolling? As the libinput documentation illustrates, it is scrolling triggered by finger movement within specific regions of the touchpad - the left and bottom edges for vertical and horizontal scrolling, respectively. This is in contrast to two-finger scrolling, triggered by a two-finger movement anywhere on the touchpad. synaptics has had edge scrolling since at least 2002, the earliest commit in the repo. Back then we didn't have multitouch-capable touchpads; these days they're the default and you'd struggle to find one that doesn't support at least two fingers. But back then edge scrolling was the default, and touchpads even had the markings for those scroll edges painted on.

libinput adds a whole bunch of features to the touchpad driver, but those features make it hard to support edge scrolling. First, libinput has quite smart software button support. Those buttons are usually on the lowest ~10mm of the touchpad. Depending on finger movement and position libinput will send a right button click, movement will be ignored, etc. You can leave one finger in the button area while using another finger on the touchpad to move the pointer. You can press both left and right areas for a middle click. And so on. On many touchpads the vertical travel/physical resistance is enough to trigger a movement every time you click the button, just by your finger's logical center moving.

libinput also has multi-direction scroll support. Traditionally we only sent one scroll event for vertical/horizontal at a time, even going as far as locking the scroll direction. libinput changes this and only requires an initial threshold to start scrolling; after that the caller gets both horizontal and vertical scroll information. The reason is simple: it's context-dependent when horizontal scrolling should be used, so a global toggle to disable it doesn't make sense. And libinput's scroll coordinates are much more fine-grained too, which is particularly useful for natural scrolling, where you'd expect the content to move with your fingers.

Finally, libinput has smart palm detection. The large majority of palm touches are along the left and right edges of the touchpad and they're usually indistinguishable from finger presses (same pressure values for example). Without palm detection some laptops are unusable (e.g. the T440 series).

These features interfere heavily with edge scrolling. Software button areas are in the same region as the horizontal scroll area, and palm presses are in the same region as the vertical edge scroll area. The lower vertical edge scroll zone overlaps with the software buttons - and that's where you would put your finger if you wanted to quickly scroll up in a document (or down, for natural scrolling). To support edge scrolling on those touchpads, we'd need heuristics and timeouts to guess when something is a palm, a software button click, a scroll movement, the start of a scroll movement, etc. The heuristics are unreliable and the timeouts reduce responsiveness in the UI. So our decision was to only provide edge scrolling on touchpads where it is required, i.e. those that cannot support two-finger scrolling, those with physical buttons. All other touchpads provide only two-finger scrolling. And we are focusing on making two-finger scrolling good enough that you don't need/want to use edge scrolling (please file bugs for anything broken).

Now, before you get too agitated: if edge scrolling is that important to you, invest the time you would otherwise spend sharpening pitchforks, lighting torches and painting picket signs into developing a model that allows us to do reliable edge scrolling in light of all the above, without breaking software buttons and while maintaining palm detection. We'd be happy to consider it.

This feature got merged for libinput 0.8 but I noticed I hadn't blogged about it. So belatedly, here is a short description of scroll sources in libinput.

Scrolling is a fairly simple concept. You move the mouse wheel and the content moves down. Beyond that the details get quite nitty, possibly even gritty. On touchpads, scrolling is emulated through a custom finger movement (e.g. two-finger scrolling). A mouse wheel moves in discrete steps of (usually) 15 degrees, a touchpad's finger movement is continuous (within the device physical resolution). Another scroll method is implemented for the pointing stick: holding the middle button down while moving the stick will generate scroll events. Like touchpad scroll events, these events are continuous. I'll ignore natural scrolling in this post because it just inverts the scroll direction. Kinetic scrolling ("fling scrolling") is a comparatively recent feature: when you lift the finger, the final finger speed determines how long the software will keep emulating scroll events. In synaptics, this is done in the driver and causes all sorts of issues - the driver may keep sending scroll events even while you start typing.

In libinput, there is no kinetic scrolling at all, what we have instead are scroll sources. Currently three sources are defined, wheel, finger and continuous. Wheel is obvious, it provides the physical value in degrees (see this post) and in discrete steps. The "finger" source is more interesting, it is the hint provided by libinput that the scroll event is caused by a finger movement on the device. This means that a) there are no discrete steps and b) libinput guarantees a terminating scroll event when the finger is lifted off the device. This enables the caller to implement kinetic scrolling: simply wait for the terminating event and then calculate the most recent speed. More importantly, because the kinetic scrolling implementation is pushed to the caller (who will push it to the client when the Wayland protocol for this is ready), kinetic scrolling can be implemented on a per-widget basis.

Finally, the third source is "continuous". The only big difference to "finger" is that we can't guarantee that the terminating event is sent, simply because we don't know if it will happen. It depends on the implementation. For the caller this means: if you see a terminating scroll event you can use it as kinetic scroll information, otherwise just treat it normally.

For both the finger and the continuous sources the scroll distance provided by libinput is equivalent to "pixels", i.e. the value that the relative motion of the device would otherwise send. This means the caller can interpret this depending on current context too. Long-term, this should make scrolling a much more precise and pleasant experience than the old X approach of "You've scrolled down by one click".
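Put together, a caller's axis-event handling might look roughly like this (a sketch against the libinput API of that era; scroll_by(), start_fling() and recent_scroll_speed() are hypothetical caller-side helpers, not libinput functions):

/* Sketch: dispatch a pointer axis event by scroll source. */
#include <libinput.h>

static void scroll_by(double value);        /* hypothetical */
static void start_fling(double speed);      /* hypothetical */
static double recent_scroll_speed(void);    /* hypothetical */

static void
handle_pointer_axis(struct libinput_event_pointer *ev)
{
    const enum libinput_pointer_axis axis =
        LIBINPUT_POINTER_AXIS_SCROLL_VERTICAL;
    double value;

    if (!libinput_event_pointer_has_axis(ev, axis))
        return;

    value = libinput_event_pointer_get_axis_value(ev, axis);

    switch (libinput_event_pointer_get_axis_source(ev)) {
    case LIBINPUT_POINTER_AXIS_SOURCE_WHEEL:
        scroll_by(value); /* discrete steps, no kinetic scrolling */
        break;
    case LIBINPUT_POINTER_AXIS_SOURCE_FINGER:
        /* a terminating event (value == 0) is guaranteed when the
         * fingers lift: safe to start a kinetic fling */
        if (value == 0.0)
            start_fling(recent_scroll_speed());
        else
            scroll_by(value);
        break;
    case LIBINPUT_POINTER_AXIS_SOURCE_CONTINUOUS:
        /* a terminating event is not guaranteed; treat one as a hint */
        if (value == 0.0)
            start_fling(recent_scroll_speed());
        else
            scroll_by(value);
        break;
    }
}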

The API documentation for all this is here: http://wayland.freedesktop.org/libinput/doc/latest/group__event__pointer.html, search for anything with "pointer_axis" in it.

February 28, 2015

Work continues on Typhon. I've recently yearned for a way to study the Monte-level call stacks for profiling feedback. After a bit of work, I think that I've built some things that will help me.

My initial desire was to get the venerable perf to work with Typhon. perf's output is easy to understand, with a little practice, and describes performance problems pretty well.

I'm going to combine this with Brendan Gregg's very cool flame graph system for visually summarizing call stacks, in order to show off how the profiling information is being interpreted. I like flame graphs and they were definitely a goal of this effort.

Maybe perf doesn't need any help. I was able to use it to isolate some Typhon problems last week. I'll use my Monte webserver for all of these tests, putting it under some stress and then looking at the traces and flame graphs.

Now seems like a good time to mention that my dorky little webserver is not production-ready; it is literally just good enough to respond to Firefox, siege, and httperf with a 200 OK and a couple bytes of greeting. This is definitely a microbenchmark.

With that said, let's look at what perf and flame graphs say about webserver performance:

An unhelpful HTTP server profile

You can zoom in on this by clicking. Not that it'll help much. This flame graph has two big problems:

  1. Most of the time is spent in the mysterious "[unknown]" frames. I bet that those are just caused by the JIT's code generation, but perf doesn't know that they're meaningful or how to label them.
  2. The combination of JIT and builtin objects with builtin methods results in totally misleading call stacks, because most object calls don't result in new methods being added to the stack.

I decided to tackle the first problem first, because it seemed easier. Digging a bit, I found a way to generate information on JIT-created code objects and get that information to perf via a temporary file.

The technique is only documented via tribal knowledge and arcane blog entries. (I suppose that, in this regard, I am not helping.) It is described both in this kernel patch implementing the feature, and also in this V8 patch. The Typhon JIT hooks show off my implementation of it.
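The convention itself is small: the JIT appends one line per generated code region to /tmp/perf-<pid>.map, in the form "START SIZE name" with the start address and size in hex, and perf reads that file back when symbolizing samples. A minimal C sketch of a writer (Typhon's real hooks are RPython, so this is illustrative):

/* Sketch of a /tmp/perf-<pid>.map writer: one "START SIZE name"
 * line per JIT-generated code region, start and size in hex. */
#include <stdio.h>
#include <stdint.h>
#include <unistd.h>

static FILE *perf_map;

void perf_map_register(uintptr_t start, size_t size, const char *name)
{
    if (!perf_map) {
        char path[64];
        snprintf(path, sizeof(path), "/tmp/perf-%d.map", getpid());
        perf_map = fopen(path, "w");
        if (!perf_map)
            return;
        setvbuf(perf_map, NULL, _IOLBF, 0); /* flush after every line */
    }
    fprintf(perf_map, "%lx %zx %s\n", (unsigned long)start, size, name);
}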

So, does it work? What does it look like?

I didn't upload a picture of this run, because it doesn't look different from the earlier graph! The big [unknown] frames aren't improved at all. Sufficient digging will reveal the specific newly-annotated frames being nearly never called. Clearly this was not a winning approach.

At this point, I decided to completely change my tack. I wrote a completely new call stack tracer inside Typhon. I wanted to do a sampling profiler, but sampling is hard in RPython. The vmprof project might fix that someday. For now, I'll have to do a precise profiler.

Unlabeled HTTP server profile with correct atoms

I omitted the coding montage. Every time a call is made from within SmallCaps, the profiler takes a time measurement before and after the call. This is pretty great! But can we get more useful names?

Names in Monte are different from names in, say, Python or Java. Python and Java both have class names. Since Monte does not have classes, Monte doesn't have a class name. A compromise which we accept here is to use the "display name" of the class, which will be the pattern used to bind a user-level object literal, and will be the name of the class for all of the runtime object classes. This is acceptable.

HTTP server profile with correct atoms and useful display names

Note how the graphs are differently shaped; all of the frames are being split out properly and the graph is more detailed as a result. The JIT is still active during this entire venture, and it'd be cool to see what the JIT is doing. We can use RPython's rpython.rlib.jit.we_are_jitted() function to mark methods as being JIT'd, and we can ask the flame graph generator to colorize them.

HTTP server profile with JIT COLORS HOLY FUCK I CANNOT BELIEVE IT WORKS

Oh man! This is looking pretty cool. Let's colorize the frames that are able to sit directly below JIT entry points. I do this with a heuristic (regular expression).

THE COLORS NEVER STOP, CAN'T STOP, WON'T STOP

This isn't even close to the kind of precision and detail from the amazing Java-on-Illumos profiles on Gregg's site, but it's more than enough to help my profiling efforts.

February 26, 2015
I recently purchased a 64GB mini SD card to slot into my laptop and/or tablet, keeping media separate from my home directory, which is pretty full of kernel sources.

This Samsung card looked fast enough, and at 25€ including shipping, seemed good enough value.


Hmm, no mention of the SD card size?

The packaging looked rather bare, and with no mention of the card's size. I opened up the packaging, and looked over the card.

Made in Taiwan?

What made it weirder is that it says "made in Taiwan", rather than "Made in Korea" or "Made in China/PRC". Samsung apparently makes some cards in Taiwan, I've learnt, but I didn't know that before getting suspicious.

After modifying gnome-multiwriter's fake-flash checker, I tested the card, and sure enough, it's an 8GB card with its firmware modified to show up as 67GB (67GB!). The device (identified through the serial number) is apparently well known in swindler circles.

Buyer beware, do not buy from "carte sd" on Amazon.fr, and always check for fake flash memory using F3 or h2testw, until udisks gets support for this.
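A typical F3 run looks like this (the mount point is illustrative): f3write fills the card's free space with test files, and f3read verifies that they read back intact:

# mount point illustrative; adjust to where the card is mounted
f3write /run/media/$USER/sdcard
f3read /run/media/$USER/sdcard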

Amazon were prompt in reimbursing me, but the Comité national anti-contrefaçon and Samsung were completely uninterested in pursuing this further.

In short:

  • Test the storage hardware you receive
  • Don't buy hardware from Damien Racaud from Chaumont, the person behind the "carte sd" seller account
February 23, 2015

Some years ago I bought myself a new laptop, deleted the Windows partition and installed Fedora on the system. Only to later realize that the system had a bug that required a BIOS update to fix, and that the only tool for doing such updates was available for Windows only. And while some tools and methods have been available from a subset of vendors, BIOS updates on Linux have always been somewhat of a hit-and-miss affair. Well, luckily it seems that we will finally get a proper solution to this problem.
Peter Jones, who is Red Hat’s representative to the UEFI working group and who is working on making sure we have everything needed to support this on Linux, approached me some time ago to let me know of the latest incoming update to the UEFI standard, which provides a mechanism for doing BIOS updates. This means that any system that supports UEFI 2.5 will in theory be one where we can initiate the BIOS update from Linux. Systems supporting this version of the UEFI specs are expected to become available through the course of this year, and if you are lucky your hardware vendor might even provide a BIOS update bringing UEFI 2.5 support to your existing hardware, although you would of course need to do that one BIOS update the old way.

So with Peter’s help we got hold of some prototype hardware from our friends at Intel which already has UEFI 2.5 support. This hardware is currently in the hands of Richard Hughes. Richard will be working on incorporating the use of this functionality into GNOME Software, so that you can do any needed BIOS updates through GNOME Software along with all your other software update needs.

Peter and Richard will, as part of this, be working to define a specification/guideline for hardware vendors on how they can make their BIOS updates available in a manner we can consume and automatically check for updates. We will try to align ourselves with the requirements from Microsoft in this area, to allow vendors to either use the exact same package for both Windows and Linux or at least only need small changes to it. We can hopefully get this specification up on freedesktop.org for wider consumption once it’s done.

I am also already speaking with a couple of hardware vendors to see if we can pilot this functionality with them, to both encourage them to support UEFI 2.5 as quickly as possible and also work with them to figure out the finer details of how to make the updates available in an easily consumable fashion.

Our hope here is that you eventually can get almost any hardware and know that if you ever need a BIOS update you can just fire up Software and it will tell you what, if any, BIOS updates are available for your hardware, and then let you download and install them. For people running Fedora servers we have had some initial discussions about doing BIOS updates through Cockpit, in addition of course to the command line tools that Peter is writing for this.

I mentioned in an earlier blog post that one of our goals with the Fedora Workstation is to drain the swamp in terms of fixing the real issues that make using a Linux desktop challenging. Well, this is another piece of that puzzle, and I am really glad we had Peter working with the UEFI standards group to ensure the final specification was useful for Linux users too.

Anyway, as soon as I get some data on concrete hardware that will support this, I will make sure to let you know.

February 22, 2015
So, I finally figured out the bug that was causing some incorrect rendering in xonotic (which, it turns out, is the same bug plaguing a lot of other games/webgl/etc).  The fix is pushed to upstream mesa master (and I guess I should probably push it to the 10.5 stable branch too).  Now that xonotic renders correctly, I think I can finally call freedreno a4xx support usable:



Also, for fun, a little comparison between the ifc6540 board (snapdragon 805, aka apq8084) and my laptop (i5-4310U).  Both have 1920x1080 resolution, and both are running gnome-shell and firefox (with identical settings).  The laptop is on fedora f21 while the ifc6540 is on rawhide, but it is quite close to an apples-to-apples comparison:



Obviously not a rigorous benchmark, so please don't read too much into the results.  The intel is still faster overall (as it should be at its size/price/power budget), but it is amazing that the gap is becoming so small between something that can be squeezed into a cell phone and dedicated laptop class chips.

February 21, 2015

Today, I will describe a new way to reverse engineer PCI drivers by creating a PCI passthrough with a QEMU virtual machine. In this article, I will show you how to use the Intel VT-d technology to trace memory mapped input/output (MMIO) accesses of a QEMU VM. As a member of the Nouveau community, I will focus this howto on NVIDIA's proprietary driver, but it should be pretty similar for all PCI drivers.

Introduction

Reverse engineering NVIDIA's proprietary driver is not an easy task, especially on Windows, because we have support for neither mmiotrace, a toolbox for tracing memory mapped I/O accesses within the Linux kernel, nor valgrind-mmt, which allows tracing application accesses to mmaped memory.

When I started to reverse engineer NVIDIA Perfkit on Windows (for graphics performance counters) in-between the Google Summer of Code 2013 and 2014, I wrote some tools for dumping the configuration of these performance counters, but it was very painful to find multiplexers because I couldn't really trace MMIO accesses. I would have liked to use Intel VT-d, but my old computer didn't support that recent technology. Recently I got a new computer, and my life has changed. ;)

But what is VT-d, and how do we use it with QEMU?

An input/output memory management unit (IOMMU) allows guest virtual machines to directly use peripheral devices, such as Ethernet controllers or accelerated graphics cards, through DMA and interrupt remapping. Intel calls this technology VT-d; AMD calls it AMD-Vi.

QEMU can use that technology through the VFIO driver, an IOMMU/device-agnostic framework for exposing direct device access to userspace in a secure, IOMMU-protected environment. In other words, this allows safe, non-privileged userspace drivers. Initially developed by Cisco, VFIO is now maintained by Alex Williamson at Red Hat.

In this howto, I will use Fedora as the guest OS, but it should work for both Linux and Windows guests. Let's get started.

Tested hardware

Motherboard: ASUS B85 PRO GAMER

CPU: Intel Core i5-4460 3.20GHz

GPU: NVIDIA GeForce 210 (host) and NVIDIA GeForce 9500 GT (guest)

OS: Arch Linux (host) and Fedora 21 (guest)

Prerequisites

Your CPU needs to support both virtualization and IOMMU (Intel VT-d technology, Core i5 at least). You will also need two NVIDIA GPUs and two monitors, or one monitor with two different inputs (one plugged into your host GPU, one into your guest GPU). I would also recommend having a separate keyboard and mouse for the guest OS.

Step 1: Hardware setup

Check if your CPU supports virtualization.

egrep -i '^flags.*(svm|vmx)' /proc/cpuinfo

If so, enable CPU virtualization support and Intel VT-d from the BIOS.

Step 2: Kernel config

1) Modify kernel config
Device Drivers --->
    [*] IOMMU Hardware Support  --->
        [*]   Support for Intel IOMMU using DMA Remapping Devices
        [*]   Support for Interrupt Remapping
Device Drivers --->
    [*] VFIO Non-Privileged userspace driver framework  --->
        [*]   VFIO PCI support for VGA devices
Bus options (PCI etc.) --->
    [*] PCI Stub driver
2) Build kernel
3) Reboot, and check if your system has support for both IOMMU and DMA remapping
dmesg | grep -e IOMMU -e DMAR
[    0.000000] ACPI: DMAR 0x00000000BD9373C0 000080 (v01 INTEL  HSW      00000001 INTL 00000001)
[    0.019360] dmar: IOMMU 0: reg_base_addr fed90000 ver 1:0 cap d2008c20660462 ecap f010da
[    0.019362] IOAPIC id 8 under DRHD base  0xfed90000 IOMMU 0
[    0.292166] DMAR: No ATSR found
[    0.292235] IOMMU: dmar0 using Queued invalidation
[    0.292237] IOMMU: Setting RMRR:
[    0.292246] IOMMU: Setting identity map for device 0000:00:14.0 [0xbd8a6000 - 0xbd8b2fff]
[    0.292269] IOMMU: Setting identity map for device 0000:00:1a.0 [0xbd8a6000 - 0xbd8b2fff]
[    0.292288] IOMMU: Setting identity map for device 0000:00:1d.0 [0xbd8a6000 - 0xbd8b2fff]
[    0.292301] IOMMU: Prepare 0-16MiB unity mapping for LPC
[    0.292307] IOMMU: Setting identity map for device 0000:00:1f.0 [0x0 - 0xffffff]

!!! If you have no output, you have to fix this before continuing !!!

Step 3: Build QEMU

git clone git://git.qemu-project.org/qemu.git --depth 1
cd qemu
./configure --python=/usr/bin/python2 # Python 3 is not yet supported
make && make install

You can also install QEMU from your favorite package manager, but I would recommend building from source if you want to enable VFIO tracing support.

Step 4: Unbind the GPU with pci-stub

Given my hardware config, I have two NVIDIA GPUs, so blacklisting the Nouveau kernel module is not a good option. Instead, I will use pci-stub to unbind the GPU which will be assigned to the guest OS.

NOTE: If pci-stub was built as a module, you’ll need to modify /etc/mkinitcpio.conf, add pci-stub in the MODULES section, and update your initramfs.

lspci
01:00.0 VGA compatible controller: NVIDIA Corporation GT218 [GeForce 210] (rev a2)
05:00.0 VGA compatible controller: NVIDIA Corporation G96 [GeForce 9500 GT] (rev a1)
lspci -n
01:00.0 0300: 10de:0a65 (rev a2) # GT218
05:00.0 0300: 10de:0640 (rev a1) # G96

Now add the following kernel parameter to your bootloader.

pci-stub.ids=10de:0640

Reboot, and check.

dmesg | grep pci-stub
[    0.000000] Command line: BOOT_IMAGE=/vmlinuz-nouveau root=UUID=5f64607c-5c72-4f65-9960-d5c7a981059e rw quiet pci-stub.ids=10de:0640
[    0.000000] Kernel command line: BOOT_IMAGE=/vmlinuz-nouveau root=UUID=5f64607c-5c72-4f65-9960-d5c7a981059e rw quiet pci-stub.ids=10de:0640
[    0.295763] pci-stub: add 10DE:0640 sub=FFFFFFFF:FFFFFFFF cls=00000000/00000000
[    0.295768] pci-stub 0000:05:00.0: claimed by stub

Step 5: Bind the GPU with VFIO

Now, it’s time to bind the GPU (the G96 card in this example) with VFIO in order to pass through it to the VM. You can use this script to make life easier:

#!/bin/bash

modprobe vfio-pci

for dev in "$@"; do
        vendor=$(cat /sys/bus/pci/devices/$dev/vendor)
        device=$(cat /sys/bus/pci/devices/$dev/device)
        if [ -e /sys/bus/pci/devices/$dev/driver ]; then
                echo $dev > /sys/bus/pci/devices/$dev/driver/unbind
        fi
        echo $vendor $device > /sys/bus/pci/drivers/vfio-pci/new_id
done

Bind the GPU:

./vfio-bind.sh 0000:05:00.0 # G96

Step 6: Testing KVM VGA-Passthrough

Let’s test if it works, as root:

qemu-system-x86_64 \
    -enable-kvm \
    -M q35 \
    -m 2G \
    -cpu host,kvm=off \
    -device vfio-pci,host=05:00.0,multifunction=on,x-vga=on

If it works fine, you should see a black QEMU window with the message “Guest has not initialized the display (yet)”. You will also need to pass -vga none, otherwise it won't work. I'll show you all the options I use a bit later.

NOTE: kvm=off is required for some recent NVIDIA proprietary drivers, because they refuse to load if they detect KVM…

Step 7: Add USB support

At this step, we have assigned the GPU to the virtual machine, but it would be a good idea to be able to use the guest OS with a keyboard, for example. To do this, we need to add USB support to the VM. The preferred way is to pass through an entire USB controller, like we already did for the GPU.

lspci | grep USB
00:14.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB xHCI (rev 05)
00:1a.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #2 (rev 05)
00:1d.0 USB controller: Intel Corporation 8 Series/C220 Series Chipset Family USB EHCI #1 (rev 05)

Add the following option to the QEMU command line (example for 00:14.0):

-device vfio-pci,host=00:14.0,bus=pcie.0

Before trying USB support inside the VM, you need to assign that USB controller to VFIO. Note that you will lose your keyboard and mouse on the host if they are connected to that controller.

./vfio-bind.sh 0000:00:14.0

In order to re-enable USB support on the host, you will need to unbind the controller from vfio-pci and bind it to xhci_hcd.

echo 0000:00:14.0 > /sys/bus/pci/drivers/vfio-pci/unbind
echo 0000:00:14.0 > /sys/bus/pci/drivers/xhci_hcd/bind

If you get an error with USB support, you might simply try a different controller, or try to assign USB devices by ID.

Step 8: Install guest OS

Now, it’s time to install the guest OS. I installed Fedora 21 because it’s just not possible to run Arch Linux inside QEMU due to a bug in syslinux… Whatever, install your favorite Linux OS and go ahead. I would also recommend to install envytools (a collection of tools developed by the members of the Nouveau community) in order to easily test the tracing support.

You can use the script below to launch a VM with VGA and USB passthrough, and all the stuff we need.

#!/bin/bash

vfio_bind()
{
    dev="$1"
    vendor=$(cat /sys/bus/pci/devices/$dev/vendor)
    device=$(cat /sys/bus/pci/devices/$dev/device)
    if [ -e /sys/bus/pci/devices/$dev/driver ]; then
        echo $dev > /sys/bus/pci/devices/$dev/driver/unbind
    fi
    echo $vendor $device > /sys/bus/pci/drivers/vfio-pci/new_id
}

# Bind devices.
modprobe vfio-pci
vfio_bind 0000:05:00.0  # GPU (NVIDIA G96)
vfio_bind 0000:00:14.0  # USB controller

qemu-system-x86_64 \
    -enable-kvm \
    -M q35 \
    -m 2G \
    -hda fedora.img \
    -boot d \
    -cpu host,kvm=off \
    -vga none \
    -device vfio-pci,host=05:00.0,multifunction=on,x-vga=on \
    -device vfio-pci,host=00:14.0,bus=pcie.0


# Restore USB controller
echo 0000:00:14.0 > /sys/bus/pci/drivers/vfio-pci/unbind
echo 0000:00:14.0 > /sys/bus/pci/drivers/xhci_hcd/bind

Step 9: Enable VFIO tracing support for QEMU

1) Configure QEMU to enable tracing

Enable the stderr trace backend. Please refer to docs/tracing.txt if you want to change the backend.

./configure --python=/usr/bin/python2 --enable-trace-backends=stderr
2) Disable MMAP support

Disabling MMAP support forces QEMU to use the slower read/write accesses to MMIO space, which can be traced. To do this, open the file include/hw/vfio/vfio-common.h, and change #define VFIO_ALLOW_MMAP from 1 to 0.

 /* Extra debugging, trap acceleration paths for more logging */
-#define VFIO_ALLOW_MMAP 1
+#define VFIO_ALLOW_MMAP 0

Re-build QEMU.

3) Add the trace points you want to observe

Create an events.txt file and add the vfio_region_write trace point, which dumps the MMIO write accesses of the GPU.

echo "vfio_region_write" > events.txt

VFIO tracing support is now enabled and configured. Really easy, huh?

Thanks to Alex Williamson for these hints.

Step 10: Trace MMIO write accesses

Let’s now test VFIO tracing support. Enable events tracing by adding the following line to the script which launchs the VM.

-trace events=events.txt

Launch the VM. You should see lots of traces on the standard error output; this is good news.

Open a terminal in the VM, go to the directory where envytools has been built, and run (as root) the following command.

./nvahammer 0xa404 0xdeadbeef

This command writes a 32-bit value (0xdeadbeef) to the MMIO register at offset 0xa404 and repeats the write in an infinite loop; it needs to be aborted manually.

Go back to the host, and you should see the following traces if it works fine.

12347@1424299207.289770:vfio_region_write  (0000:05:00.0:region0+0xa404, 0xdeadbeef, 4)
12347@1424299207.289774:vfio_region_write  (0000:05:00.0:region0+0xa404, 0xdeadbeef, 4)
12347@1424299207.289778:vfio_region_write  (0000:05:00.0:region0+0xa404, 0xdeadbeef, 4)

In this example, we have only traced MMIO write accesses, but of course, if you want to trace read accesses, you just have to change vfio_region_write to vfio_region_read.

Congratulations!

In this article I showed you how to trace MMIO accesses using PCI passthrough with QEMU, Intel VT-d and VFIO. However, all PCI accesses are currently traced, including the USB controller's, which is not ideal; mmiotrace, by contrast, only dumps accesses for one peripheral. It would also be a good idea to output the same format as mmiotrace, in order to reuse the decoding tools we already have for it in envytools.
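
As a starting point for such a converter, the stderr trace lines are regular enough to parse with a few lines of Python. A rough sketch (the regex is an approximation, derived only from the sample output above):

import re
import sys

# Matches lines like:
# 12347@1424299207.289770:vfio_region_write  (0000:05:00.0:region0+0xa404, 0xdeadbeef, 4)
TRACE = re.compile(
    r'(?P<pid>\d+)@(?P<ts>[\d.]+):vfio_region_(?P<dir>read|write)\s+'
    r'\((?P<dev>[0-9a-f:.]+):region(?P<region>\d+)\+(?P<offset>0x[0-9a-f]+),\s*'
    r'(?P<value>0x[0-9a-f]+),\s*(?P<size>\d+)\)')

for line in sys.stdin:
    m = TRACE.match(line.strip())
    if m:
        # One access per line: direction, BAR offset, value, size.
        print("%s %s %s %s" % (m.group('dir'), m.group('offset'),
                               m.group('value'), m.group('size')))

Piping QEMU's stderr through a filter like this would be a small first step towards the mmiotrace-style output mentioned below.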

Future work

– do not trace all PCI accesses (device and subrange address filtering)

– VFIO traces to the mmiotrace format

– compare performance with tracing support enabled and disabled

Related resources

KVM VGA-Passthrough on ArchLinux

VGA-Passthrough on Debian

VFIO documentation

QEMU VFIO tracing documentation


February 18, 2015

FOSDEM 2015

It was another great FOSDEM this year. Even with its 5,000-10,000 attendees, the formula of being absolutely free, with limited sponsorship, while only making small changes each year, is an absolute winner. There is just no conference that comes even close to FOSDEM.

For those on ICE14 on Friday, the highspeed train from Frankfurt to Brussels south at 14:00, who were so nice to participate in my ad-hoc visitor count: i counted 66 people, but i might have skipped a few, as people were sometimes too engrossed in their laptops to hear me over their headphones. On a ~400 seat train, that's a pretty high number, and i never see the same level of geekiness on the Frankfurt to Brussels trains as on the Friday before FOSDEM. If it didn't sound like an organizational nightmare, it might be a good idea to talk to DB and get a whole carriage reserved especially for FOSDEM goers.

With the Graphics DevRoom we returned to the K building this year, and i absolutely love the cozy 80-person classrooms we have there. With good airflow, freely movable tables, and an easy way to put in my power sockets, i have an easy time as a devroom organizer. While some speakers probably prefer bigger rooms with a higher number of attendants, there is nothing like the more direct interaction of the rooms in the K building. With us sitting on the top floor, we also only had people who were very actively interested in the topics of the graphics devroom.

Despite the fact that FOSDEM has no equal, and that anyone who does anything with open source software in the European Union should attend, i always have a very difficult time recruiting a full set of speakers for the graphics DevRoom. Perhaps the biggest reason for this is the fact that it is a free conference, and that it lacks the elitist status of a paid-for conference. Everyone can attend your talk, even people who do not work on the kernel or on graphics drivers, and the potential speaker might feel as if he needs to waste his time on people who are below his own perceived station. Another reason may be that it is harder to convince the beancounters to sponsor a visit to a free conference. In that case, though: if you live in the European Union and are paid to do open source software, you should be able to afford going to the must-visit FOSDEM yourself.

As for next year, i am not sure whether there will be a graphics devroom again. Speaker count really was too low and perhaps it is time for another hiatus. Perhaps it is once again time to show people that talking in a devroom at FOSDEM truly is a privilege, and not to be taken for granted.

Tamil "Driver" talk.

My talk this year, or rather, my incoherent mumble finished off with a demo, was about showing my work on the ARM Mali Midgard GPUs. For those who had to endure it, my apologies for my ill-preparedness; i poured all my efforts into the demo (which was finished on Wednesday), and spent too much time doing devroom stuff (which ate Thursday) and of course drinking up, ahem, the event that is FOSDEM (which ate the rest of the weekend). I will try to make up for it now in this blog post.

Current Tamil Status.

As some might remember, in September and October 2013, i did some preliminary work on the Mali T-series. I spent about 3 to 3.5 weeks building the infrastructure to capture the command stream and replay it. At the same time I also dug deep into the Mali binary driver to expose the binary shader compiler. These two feats gave me all the prerequisites for doing the job of reverse engineering the command stream of the Mali Midgard.

During the Christmas holidays of 2014 and in January 2015, i spent my time building a command stream parser. This is a huge improvement over my work on lima, where i only ended up doing so later on in the process (while bringing up Q3A). As i built up the capabilities of this parser, i threw ever more complex GLES2 programs at it. A week before FOSDEM, my parser was correctly handling multiple draws and frames, uniforms, attributes, varyings and textures. Instead of having raw blobs of memory, i had C structs and tables, allowing me to easily see the differences between streams.

I then took the parsed result of my most complex test and slowly turned that into actual C code, using the shader binary produced by the binary compiler, and adding a trivial sequential memory allocator. I then added rotation into the mix, and this is the demo as seen on FOSDEM (and now uploaded to youtube).

All the big things are known.

For textures, i only have simple texture support at this time: no mipmapping nor cubemapping yet, and only RGB565 and RGBA32 are supported. I also still have not figured out how to disable swizzling; instead i re-use the texture swizzling code from lima, the only place where I was able to re-use code in tamil. This missing knowledge is just some busywork, and a bit of coding, away.

As for programs, while both the Mali Utgard (M-series) and Midgard (T-series) binary compilers output a format called MBS (Mali Binary Shader), the contents of each file are significantly different. I had no option but to rewrite the MBS parser for tamil.

Instead of rewriting the vertex shader binaries like ARM's binary driver does, i reorder the entries in the attribute descriptor table to match the order described by the shader compiler output. This avoids adding a whole lot of logic to handle this correctly, even though MBS now describes which bits to alter in the binary. I still lay uniforms, attributes and varyings out by hand though, as i similarly have only limited knowledge of typing at this point. This mostly is a bit of busywork: writing up the actual logic, and trying out a few different things.

I know only very few things about most of the GL state. Again, this is mostly busywork, with a bit of testing and coding up the results. And while many values left and right are still magic, nothing big is hiding any more.

Unlike with lima, i am refraining from building up more infrastructure (than necessary to show the demo) outside of Mesa. The next step really is writing a mesa driver. Since my lima driver for mesa was already pretty advanced, i should be able to re-use a lot of the knowledge gained there, and perhaps some code.

The demo

The demo was shown on a Samsung ARM Chromebook, with a kernel and linux installation from september 2013 (when i brought up command stream capture and exposed the shader compiler). The exynos KMS driver on this 3.4.0 kernel is terrible. It only accepts a few fixed resolutions (as if I never existed and modesetting wasn't a thing), and then panics when you even look at it. Try to leave X while using HDMI: panic. Try to use a KMS plane to display the resulting render: panic.

In the end, i used fbdev and memcpy the rendered frame over to the console. On this kernel, i cannot even disable the console, so some of the visible flashing is the console either being overwritten by or overwriting the copied render.

The youtube video shows a capture of the Chromebook's built-in LCD, at 1280x720 on a 1366x768 display. At FOSDEM, i was lucky that the projector accepted the 1280x720 mode the exynos hdmi driver produced. My dumb HDMI->VGA converter (which makes the image darker) was willing to pass this through directly. I have a more intelligent HDMI->VGA adapter which also does scaling and keeps colours nice, but that one just refused the output of the exynos driver. The video that was captured in our devroom probably does not show the demo correctly, as that should've been at 1024x768.

The demo shows 3 cubes rotating in front of the milky way. It is 4 different draws, using 3 different textures, and 3 different programs. These cubes currently rotate at 47fps, with the memcpy. During the talk, the chromebook slowed down progressively down to 26fps and even 21fps at one point, but i have not seen that behaviour before or since. I do know of an issue that makes the demo fail at frame 79530, which is 100% reproducible. I still need to track this down, it probably is an issue with my job handling code.

linux-exynos.org

With Lima and Tamil i am in a unique position. Unlike on adreno, tegra or Videocore, i have to deal with many different SoCs. Apart from the difference in kernel GPU drivers, i also have to deal with differences in display drivers, and i run into a whole new world of hurt every time i move over to a new target device. The information for doing a proper linux installation on an android or chrome device is usually dispersed, not up to date, and not too good, and i get to do a lot of the legwork myself every time, knowing full well that a lot of others have done so already but couldn't be bothered to document things (hence my role in the linux-sunxi community).

The ARM chromebook and its broken kms driver are more of the same. Last FOSDEM i complained about how badly supported and documented the Samsung ARM chromebook is, despite its popularity, and appealed for more linux-sunxi style, SoC-specific communities, especially since I, as a graphics driver developer, cannot go and spend as much time in each and every one of the SoC projects as i have done with sunxi.

During the questions round of this talk, one guy in the audience asked what needed to be done to fix the SoC pain. At first i completely missed the question, upon which he went and rephrased it. My answer was: provide the infrastructure, make some honest noise, and people will come. Usually, when someone asks such a question, nothing ever comes of it. But Merlijn "Wizzup" Wajer and his buddy S.J.R. "Swabbles" van Schaik really came through.

Today there is the linux-exynos.org wiki, the linux-exynos mailinglist, and the #linux-exynos irc channel. While the community is not as large as linux-sunxi, it is steadily growing. So if you own exynos based hardware, or if your company is basing a product on the exynos chipset, head to linux-exynos.org and help these guys out. Linux-exynos deserves your support.
February 16, 2015

A few months ago, I wrote the definitive guide about Python method declaration, which was quite a success. I still fight every day in OpenStack to have developers declare their methods correctly in the patches they submit.

Automation plan

The thing is, I really dislike doing the same things over and over again. Furthermore, I'm not perfect either, and I miss a lot of these kinds of problems in the reviews I do. So I decided to replace myself with a program – a more scalable and less error-prone version of my brain.

In OpenStack, we rely on flake8 to do static analysis of our Python code in order to spot common programming mistakes.

But we are really pedantic, so we wrote some extra hacking rules that we enforce on our code. To that end, we wrote a flake8 extension called hacking. I really like these rules; I even recommend applying them in your own project. Though I might be biased, or a victim of Stockholm syndrome. Your call.

Anyway, it's pretty clear that I need to add a check for method declaration in hacking. Let's write a flake8 extension!

Typical error

The typical error I spot is the following:

class Foo(object):
    # self is not used, the method does not need
    # to be bound, it should be declared static
    def bar(self, a, b, c):
        return a + b - c


That would be the correct version:

class Foo(object):
    @staticmethod
    def bar(a, b, c):
        return a + b - c


This kind of mistake is not a show-stopper. It's just not optimized. Why you have to manually declare static or class methods might be a language issue, but I don't want to debate about Python misfeatures or design flaws.

Strategy

We could probably use some big magical regular expression to catch this problem. flake8 is based on the pep8 tool, which can do a line-by-line analysis of the code. But this method would make it very hard and error-prone to detect this pattern.

However, it's also possible to do an AST-based analysis on a per-file basis with pep8. So that's the method I picked, as it's the most solid.

AST analysis

I won't dive deeply into Python AST and how it works. You can find plenty of sources on the Internet, and I even talk about it a bit in my book The Hacker's Guide to Python.

To check that all the methods in a Python file are correctly declared, we need to do the following:

  • Iterate over all the statement nodes of the AST
  • Check that the statement is a class definition (ast.ClassDef)
  • Iterate over all the function definitions (ast.FunctionDef) of that class statement to check if it is already declared with @staticmethod or not
  • If the method is not declared static, we need to check if the first argument (self) is used somewhere in the method
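
The first two steps are easy to try standalone, with nothing but the ast module. A quick illustration (the sample source is made up):

import ast

source = """
class Foo(object):
    def bar(self, a, b, c):
        return a + b - c
"""

tree = ast.parse(source)  # the same kind of tree flake8 will hand us
for stmt in ast.walk(tree):
    if isinstance(stmt, ast.ClassDef):
        for body_item in stmt.body:
            if isinstance(body_item, ast.FunctionDef):
                print("method %s.%s at line %d"
                      % (stmt.name, body_item.name, body_item.lineno))

Running this prints "method Foo.bar at line 3".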

Flake8 plugin

In order to register a new plugin in flake8 via hacking, we just need to add an entry in setup.cfg:

[entry_points]
flake8.extension =
    […]
    H904 = hacking.checks.other:StaticmethodChecker
    H905 = hacking.checks.other:StaticmethodChecker


We register 2 hacking codes here. As you will notice later, we are actually going to add an extra check in our code for the same price. Stay tuned.

The next step is to write the actual plugin. Since we are using an AST based check, the plugin needs to be a class following a certain signature:

@core.flake8ext
class StaticmethodChecker(object):
    def __init__(self, tree, filename):
        self.tree = tree

    def run(self):
        pass


So far, so good, and pretty easy. We store the tree locally, then we just need to use it in run() and yield the problems we discover, following pep8's expected signature: a tuple of (lineno, col_offset, error_string, code).

This AST is made for walking ♪ ♬ ♩

The ast module provides the walk function, which allows you to easily iterate over a tree. We'll use that to run through the AST. First, let's write a loop that ignores the statements that are not class definitions.

@core.flake8ext
class StaticmethodChecker(object):
    def __init__(self, tree, filename):
        self.tree = tree

    def run(self):
        for stmt in ast.walk(self.tree):
            # Ignore non-class
            if not isinstance(stmt, ast.ClassDef):
                continue


We still don't check for anything, but we know how to ignore statements that are not class definitions. The next step is to ignore what is not a function definition. We just iterate over the attributes of the class definition.

for stmt in ast.walk(self.tree):
    # Ignore non-class
    if not isinstance(stmt, ast.ClassDef):
        continue
    # If it's a class, iterate over its body members to find methods
    for body_item in stmt.body:
        # Not a method, skip
        if not isinstance(body_item, ast.FunctionDef):
            continue


We're all set for checking the method, which is body_item. First, we need to check if it's already declared as static. If so, we don't have to do any further checks and we can bail out.

for stmt in ast.walk(self.tree):
    # Ignore non-class
    if not isinstance(stmt, ast.ClassDef):
        continue
    # If it's a class, iterate over its body members to find methods
    for body_item in stmt.body:
        # Not a method, skip
        if not isinstance(body_item, ast.FunctionDef):
            continue
        # Check that it has a decorator
        for decorator in body_item.decorator_list:
            if (isinstance(decorator, ast.Name)
                    and decorator.id == 'staticmethod'):
                # It's a static function, it's OK
                break
        else:
            # Function is not static, we do nothing for now
            pass


Note that we use the special for/else form of Python, where the else is evaluated unless we used break to exit the for loop.
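
If you have never run into this construct before, here is the behaviour in isolation (the function name is mine, purely for illustration):

def has_staticmethod(decorator_names):
    for name in decorator_names:
        if name == 'staticmethod':
            break
    else:
        # Only reached when the loop ran out without hitting break.
        return False
    return True

print(has_staticmethod(['abc.abstractmethod']))  # False
print(has_staticmethod(['staticmethod']))        # True

With that behaviour in mind, let's extend the checker: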

for stmt in ast.walk(self.tree):
    # Ignore non-class
    if not isinstance(stmt, ast.ClassDef):
        continue
    # If it's a class, iterate over its body members to find methods
    for body_item in stmt.body:
        # Not a method, skip
        if not isinstance(body_item, ast.FunctionDef):
            continue
        # Check that it has a decorator
        for decorator in body_item.decorator_list:
            if (isinstance(decorator, ast.Name)
                    and decorator.id == 'staticmethod'):
                # It's a static function, it's OK
                break
        else:
            try:
                first_arg = body_item.args.args[0]
            except IndexError:
                yield (
                    body_item.lineno,
                    body_item.col_offset,
                    "H905: method misses first argument",
                    "H905",
                )
                # Check next method
                continue


We finally added a check! We grab the first argument from the method signature. If that fails, we know there's a problem: you can't have a bound method without the self argument. In that case, we raise the H905 code to signal a method that is missing its first argument.

Now you know why we registered this second pep8 code along with H904 in setup.cfg. We have here a good opportunity to kill two birds with one stone.

The next step is to check if that first argument is used in the code of the method.

for stmt in ast.walk(self.tree):
    # Ignore non-class
    if not isinstance(stmt, ast.ClassDef):
        continue
    # If it's a class, iterate over its body members to find methods
    for body_item in stmt.body:
        # Not a method, skip
        if not isinstance(body_item, ast.FunctionDef):
            continue
        # Check that it has a decorator
        for decorator in body_item.decorator_list:
            if (isinstance(decorator, ast.Name)
                    and decorator.id == 'staticmethod'):
                # It's a static function, it's OK
                break
        else:
            try:
                first_arg = body_item.args.args[0]
            except IndexError:
                yield (
                    body_item.lineno,
                    body_item.col_offset,
                    "H905: method misses first argument",
                    "H905",
                )
                # Check next method
                continue
            for func_stmt in ast.walk(body_item):
                if six.PY3:
                    if (isinstance(func_stmt, ast.Name)
                            and first_arg.arg == func_stmt.id):
                        # The first argument is used, it's OK
                        break
                else:
                    if (func_stmt != first_arg
                            and isinstance(func_stmt, ast.Name)
                            and func_stmt.id == first_arg.id):
                        # The first argument is used, it's OK
                        break
            else:
                yield (
                    body_item.lineno,
                    body_item.col_offset,
                    "H904: method should be declared static",
                    "H904",
                )


To that end, we iterate using ast.walk again and look for uses of the same variable name (usually self, but it could be anything, like cls for @classmethod) in the body of the function. If it's not found, we finally yield the H904 error code. Otherwise, we're good.

Conclusion

I've submitted this patch to hacking, and, fingers crossed, it might be merged one day. If it's not, I'll create a new Python package with that check for flake8. The actual submitted code is a bit more complex, to take into account the use of the abc module, and includes some tests.

As you may have noticed, the code walks over the module's AST several times. There might be a couple of optimizations to browse the AST in only one pass, but I'm not sure it's worth it considering the actual usage of the tool. I'll leave that as an exercise for the reader interested in contributing to OpenStack. 😉

Happy hacking!

February 14, 2015

Two years of having a real job have made me bitter and grumpy, so I'm gonna stop with the cutesy blog post titles.

Today, I'm gonna talk a bit about Monte. Specifically, I'm going to talk about Typhon's current virtual machine. Typhon used to use an abstract syntax tree interpreter ("AST interpreter") as its virtual machine. It now uses a variant of the SmallCaps bytecode machine. I'll explain how and why.

When I started designing Typhon, I opted for an AST VM because it seemed like it matched Monte's semantics well. Monte is an expression language, which means that every syntactic node is an expression which can be evaluated to a single value. Monte is side-effecting, so evaluation must happen in an environment which can record the side effects. An AST interpreter could have an object representing each syntactic node, and each node could be evaluated within an environment to produce a value.

class Node(object):

    def evaluate(self, environment):
        pass

Subclasses of Node can override evaluate() to have behavior specific to that node. The Assign class can assign values to names in the environment, the Object class can create new objects, and the Call class can pass messages to objects. This technique is amenable to RPython, since Node.evaluate() has the same signature for all nodes.
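
To make that concrete, here is a toy version of the pattern (the class names are illustrative, not Typhon's actual node classes):

class Node(object):
    def evaluate(self, environment):
        raise NotImplementedError

class Literal(Node):
    def __init__(self, value):
        self.value = value

    def evaluate(self, environment):
        return self.value

class Assign(Node):
    def __init__(self, name, expr):
        self.name = name
        self.expr = expr

    def evaluate(self, environment):
        value = self.expr.evaluate(environment)
        environment[self.name] = value  # the side effect, recorded in the environment
        return value                    # every expression yields a value

env = {}
Assign("x", Literal(42)).evaluate(env)
print(env)  # {'x': 42}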

How well does this work with the RPython just-in-time ("JIT") compiler? It does alright. The JIT has little problem seeing that the nodes are compile-time constant, and promotion applied strategically allows the JIT to see into things like Call nodes, which cause methods to be inlined relatively well. However, there are some problems.

The first problem is that Monte has no syntactic loops. Since RPython's JIT is designed to compile loops, this means that some extra footwork is needed to set up the JIT. I took Monte's looping object, __loop, and extended it to detect when its argument is likely to yield a JIT trace. The detection is simple: it examines whether the loop body is a user-defined object and only enters the JIT when that is the case. In theory, most bodies will be user-defined, since Monte turns this:

object SubList:
    to coerce(specimen, ej):
        def [] + l exit ej := specimen
        for element in l:
            subguard.coerce(element, ej)
        return specimen

into this:

object SubList:
    method coerce(specimen, ej):
        def via (__splitList.run([0])) [l] exit ej := specimen
        var validFlag__6 := true
        try:
            __loop.run([l, object _ {
                method run(_, value__8) {
                    __validateFor.run([validFlag__6])
                    subguard.coerce([value__8, ej])
                    null
                }
            }])
        finally:
            validFlag__6 := false
        specimen

This example is from Typhon's prelude. The expansion of loops, and other syntax, is defined to turn the full Monte language into a smaller language, Kernel-Monte, which resembles Kernel-E. Now, in this example, the nonce loop object generated by the expander is very custom, and probably couldn't be simplified any further. However, it is theoretically possible that an optimizer could detect loops that call a single method repeatedly and simplify them more aggressively. In practice, that optimization doesn't exist, so Typhon thinks that all loops are user-defined and allows the JIT to trace all of them.

The next hurdle has to do with names. Monte's namespaces are static, which means that it's possible to always know in advance which names are in a scope. This is a powerful property for compilers, since it opens up many kinds of compilation and lowering which aren't available to languages with dynamic namespaces like Python. However, Typhon doesn't have access to the scoping information from the expander, only the AST. This means that Typhon has to redo most of the name and scope analysis in order to figure out things like how big namespaces should be and where to store everything. I initially did all of this at runtime, in the JIT, but it is very slow. This is because the JIT has problems seeing into dictionaries, and it cannot trust that a dictionary is actually constant-size or constant-keyed.

RPython does have an answer to this, called virtual and virtualizable objects. A virtual object is one that is never constructed within a JIT trace, because the JIT knows that the object won't leave the trace and can be totally elided. (The literature talks at length of "escape analysis", the formal term for this.) Typhon's AST nodes occasionally generated virtual objects, but only rarely, because most objects are assigned to names in the environment, and the JIT refuses to ignore modifications to the environment.

Virtualizable objects solve this problem neatly. A virtualizable object has one or more attributes, called virtualizable attributes, which can be "out-of-sync" during a JIT trace. A virtualizable can delay being updated, as long as it's updated at some point before the end of the JIT trace. RPython allows fields of integers and floating-point values to be virtualizable, as well as constant-size lists. However, Typhon uses mappings of names for its environments, backed by dictionaries, and dictionaries can't be virtualizable.
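
For the curious, declaring a virtualizable in RPython looks roughly like this. This is a sketch from RPython's documented interface, not Typhon's actual code:

from rpython.rlib import jit

class Frame(object):
    # 'locals[*]' marks a constant-size list whose contents the JIT
    # may keep out of sync with the heap during a trace.
    _virtualizable_ = ['locals[*]']

    def __init__(self, size):
        self = jit.hint(self, access_directly=True,
                        fresh_virtualizable=True)
        self.locals = [None] * size

# The JitDriver is told which red variable is the virtualizable.
driver = jit.JitDriver(greens=['node'], reds=['frame'],
                       virtualizables=['frame'])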

The traditional solution to this problem involves assigning an index to every name within a scope, and using a constant-size list with those indices. I did this, but it was arduous and didn't have the big payoff that I had wanted. Why not? Well, every AST node introduces a new scope! This made scope creation expensive. I realized that I needed to compute closures as well.
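
The name-to-index idea itself is simple; a toy illustration, nothing Typhon-specific:

# Compile time: scope analysis assigns every name a fixed slot.
slots = {}
def slot_of(name):
    return slots.setdefault(name, len(slots))

X = slot_of("x")
Y = slot_of("y")

# Run time: the environment frame is a constant-size list, which the
# JIT handles far better than a dictionary.
frame = [None] * len(slots)
frame[X] = 42
frame[Y] = 7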

Around this time, a few months ago, I began to despair because debugging the AST VM is hard. RPython's JIT logging and tooling is all based around the assumption that there is a low-level virtual machine of some sort which has instructions and encapsulation, and the AST was just too hard to manage in this way. I had had to invent my own tracebacks, my own logging, and my own log viewer. This wasn't going well. I wanted to join the stack-based VM crew and not have piles of ASTs to slog through whenever something wasn't going right. So, I decided to try to implement SmallCaps, from E. E, of course, is the inspiration for Monte, and shares many features with Monte. SmallCaps was based on old Smalltalk systems, but was designed to work with unique E features like ejectors.

So, enough talk, time for some code. First, let's lay down some ground rules. These are the guiding semantics of SmallCaps in Typhon. Keep in mind that we are describing a single-stack automaton with a side stack for exception handling and an environment with frames for local values and closed-over "global" values.

  • All expressions return a value. Therefore, an expression should always compile to some instructions which start with an empty stack and leave a single value on the stack.
  • All patterns perform some side effects in the environment and return nothing. Therefore, they should compile to instructions which consume two values from the stack (specimen and ejector) and leave nothing.
  • When an exception handler is required, every handler must be dropped when it's no longer needed.

With these rules, the compiler's methods become very obvious.

class Str(Node):
    """
    A literal string.
    """

    def compile(self, compiler):
        index = compiler.literal(StrObject(self._s))
        compiler.addInstruction("LITERAL", index)

Strings are compiled into a single LITERAL instruction that places a string on the stack. Simple enough.

class Sequence(Node):
    """
    A sequence of nodes.
    """

    def compile(self, compiler):
        for node in self._l[:-1]:
            node.compile(compiler)
            compiler.addInstruction("POP", 0)
        self._l[-1].compile(compiler)

Here we compile sequences of nodes by compiling each node in the sequence, and then using POP to remove each intermediate node's result, since they aren't used. This nicely mirrors the semantics of sequences, which are to evaluate every node in the sequence and then return the value of the ultimate node's evaluation.

This also shows off a variant on the Visitor pattern which Allen, Mike, and I are calling the "Tourist pattern", where an accumulator is passed from node to node in the structure and recursion is directed by each node. This makes managing the Expression Problem much easier, since nodes completely contain all of the logic for each accumulation, and makes certain transformations much easier. More on that in a future post.

class FinalPattern(Pattern):

    def compile(self, compiler):
        # [specimen ej]
        if self._g is None:
            compiler.addInstruction("POP", 0)
            # [specimen]
        else:
            self._g.compile(compiler)
            # [specimen ej guard]
            compiler.addInstruction("ROT", 0)
            compiler.addInstruction("ROT", 0)
            # [guard specimen ej]
            compiler.call(u"coerce", 2)
            # [specimen]
        index = compiler.addFrame(u"_makeFinalSlot")
        compiler.addInstruction("NOUN_FRAME", index)
        compiler.addInstruction("SWAP", 0)
        # [_makeFinalSlot specimen]
        compiler.call(u"run", 1)
        index = compiler.addLocal(self._n)
        compiler.addInstruction("BINDSLOT", index)
        # []

This pattern is compiled to insert a specimen into the environment, compiling the optional guard along the way and ensuring order of operations. The interspersed comments represent the top of stack in-between operations, because it helps me keep track of how things are compiled.

With this representation, the Compiler is able to see the names and indices of every binding introduced during compilation, which means that creating index-based frames as constant-size lists is easy. (I was going to say "trivial," but it was not trivial!)

I was asked on IRC about why I chose to adapt SmallCaps instead of other possible VMs. The answer is mostly that SmallCaps was designed and implemented by people that were much more experienced than me, and that I trust their judgement. I tried several years ago to design a much purer concatenative semantics for Kernel-E, and failed. SmallCaps works, even if it's not the simplest thing to implement. I did briefly consider even smaller semantics, like those of the Self language, but I couldn't find anything expressive enough to capture all of Kernel-E's systems. Ejectors are tricky.

That's all for now. Peace.

So apparently, this happened, and then this. Long story short, the elementary OS guys had been offered the use of SPI as a legal entity to represent the project, something they didn't need at all. And since they declined, Joshua Drake, apparently a director at SPI, decided to threaten them with bad press all over if they didn't agree to join SPI. Which he then did: he started several threads on reddit and wrote a blog post trying to undermine the project. The post is now deleted, leaving only this aberration of an apology (which is total BS and shows how much of an ass he is).

I seriously don't get why this guy has not been fired from the SPI organization immediately. This sort of bullying behaviour should not be allowed, and, at least in my book, an apology means nothing. Someone like that does not belong in an organization that is supposed to help free software thrive and protect its communities.

I don't get how SPI expects the community to trust them at all after this.

I am really angry about this, and I would like to express all my support to the elementary OS guys.

February 11, 2015
Linux 3.19 was just released and my usual overview of what the next merge window will bring is more than overdue. The big thing overall is certainly all the work around atomic display updates, but read on for what else has been done.

Let's first start with all the driver-internal rework to support atomic. The big thing with atomic is that it requires a clean split between the code that checks display updates and the code that commits a new display state to the hardware. The corollary is that any derived state that's computed in the validation code and needed in the commit code must be stored somewhere in the state object. Gustavo Padovan and Matt Roper have done all that work to support atomic plane updates. This is the code that's now in 3.20 as a tech preview. The big things missing for proper atomic plane updates are async commit support (which has already landed for 3.21) and support to adjust watermark settings on the fly. Patches from Ville have been around for a long time, but need to be rebased, reviewed and extended for recently added platforms.

On the modeset side Ander Conselvan de Oliveira has done a lot of the necessary work already. Unfortunately converting the modeset code is much more involved, for mainly two reasons: First, there's a lot more derived state that needs to be handled, and the existing code already has structures and code for this. Conceptually the code has been prepared for an atomic world since the big display rewrite and the introduction of CRTC configuration structures, but converting the i915 modeset code to the new DRM atomic structures and interfaces is still a lot of work. Most of these refactorings have landed in 3.20. The other hold-up is shared resources and the software state to handle them. This is mostly for handling display PLLs, but also other shared resources like the display FIFO buffers. Patches to handle this are still in flight.

Continuing with modeset work, Jani Nikula has reworked the DSI support to use the shared DSI helpers from the DRM core. Jani also reworked the DSI code in preparation for dual-link DSI support, which Gaurav Singh implemented. Rodrigo Vivi and others provided a lot of patches to improve PSR support and enable it for Baytrail/Braswell. Unfortunately there are still issues with the automated testcase, so PSR stays disabled by default for now. Rodrigo also wrote some nice DocBook documentation for fbc, another step towards fully documenting i915 driver internals.

Moving on to platform enabling, there has been a lot of work from Ville on Cherryview: RPS/gpu turbo and pipe CRC support (used for automated display testing) are both improved. On Skylake almost all the basic enabling is merged now: PM patches, enabling mmio pageflips and fastboot support from Damien have landed. Tvrtko Ursulin also created the infrastructure for global GTT views, which will be used for some upcoming features on Skylake. And to simplify enabling future platforms, Chris Wilson and Mika Kuoppala have completely rewritten the forcewake domains handling code.

Also really important for Skylake is that the execlist support for gen8+ command submission is finally in good enough shape to be used by default - on Skylake that's the only supported path, as legacy ring submission has been deprecated. Around that feature and a few other closely related ones a lot of code was refactored: John Harrison implemented the conversion from raw sequence numbers to request objects for gpu progress tracking, which is also prep work for landing a gpu scheduler. Nick Hoath removed the special execlist request tracking structures, simplifying the code. The engine initialization code was also refactored for a cleaner split between software and hardware initialization, leading to more robust code for driver load and system resume. Dave Gordon has also reworked the code for tracking and computing the ring free space. On top of that we've also enabled full ppgtt again, but for now only where execlists are available, since there are still issues with the legacy ring-based pagetable loading.

For generic GEM work there's the really nice support for write-combine cpu memory mappings from Akash Goel and Chris Wilson. On Atom SoC platforms, which lack the giant LLC that bigger chips have, this gives a really fast way to upload textures. And even with LLC it's useful for uploading to scanout targets, since those are always uncached. But like the special-purpose upload paths for LLC platforms, the cpu mmap views do not detile textures, hence special-purpose fastpaths need to be written in mesa and other userspace to fully exploit this. In other GEM features, the shadow batch copying code for the command parser has now also landed.

Finally there's the redesign from Imre Deak to use the component framework for the snd-hda/i915 interactions. Modern platforms need a lot of coordination between the graphics and sound driver sides for audio over hdmi, and thus far this was done with some ad-hoc dynamic symbol lookups, which results in a lot of headaches getting the ordering correct for driver load or system suspend and resume. With the component framework this dependency is now explicit, which means we will be able to get rid of a few hacks. It's also much easier to extend for the future - new platforms tend to integrate different components even more.
February 09, 2015


Your Dell modem not getting online?

It’s not uncommon to find weird mobile broadband modems that for one reason or another don’t end up working as expected with NetworkManager/ModemManager; but the new 3G/4G modems in Dell laptops are at a total different level. These Dell-branded devices are really Sierra Wireless powered modems, e.g. the Dell 5808 is a Sierra Wireless MC7355, or the Dell DW5570 is a Sierra Wireless MC8805.

Late last year we started to receive several bug reports in the ModemManager and libqmi mailing lists for these kinds of devices. Basically, the modem would never get into a proper online mode with the RF transceivers powered, and therefore would never even get registered on the mobile network. This was happening with both QMI and MBIM based configurations, and the direct error message reported by libqmi when trying to get into online mode was just… not very helpful.


  $ sudo qmicli -d /dev/cdc-wdm1 --dms-get-operating-mode
  [/dev/cdc-wdm1] Operating mode retrieved:
    Mode: 'low-power'
    HW restricted: 'no'


  $ sudo qmicli -d /dev/cdc-wdm1 --dms-set-operating-mode=online
  error: couldn't set operating mode: QMI protocol error (3): 'Internal'

The issue was reported to the kernel, assuming that this would likely be some new missing rfkill-related setup in newer Dell laptops. One of the users reported in that same bugreport that actually using Sierra's GobiNet driver instead of qmi_wwan would end up putting the modem in online mode, so just switching drivers during boot would make it work. WTF?

Digging in Sierra’s GobiNet QMI driver

Well, without much hope of finding anything, and given that I had just bought such a Dell modem myself for testing a new “Dell” plugin, I decided to dig into Sierra's kernel driver sources. Apart from some already known things (e.g. they use the WDA service to set the net data format in new modems instead of the old CTL service), these lines popped up:

  if (is9x15)
  {
    // Set FCC Authentication
    result = QMIDMSSWISetFCCAuth( pDev );
    if (result != 0)
    {
      return result;
    }
  }

The Sierra GobiNet driver is sending some magic “FCC auth” command to the modem during boot, which according to the driver sources maps to command 0x555F in the DMS service. Hey, I should try that!

Adding the new command support to libqmi wasn't difficult, so within minutes I was ready to test it… and it worked.

  $ sudo qmicli -d /dev/cdc-wdm1 --dms-get-operating-mode
  [/dev/cdc-wdm1] Operating mode retrieved:
    Mode: 'low-power'
    HW restricted: 'no'


  $ sudo qmicli -d /dev/cdc-wdm1 --dms-set-fcc-authentication
  [/dev/cdc-wdm1] Successfully set FCC authentication


  $ sudo qmicli -d /dev/cdc-wdm1 --dms-get-operating-mode
  [/dev/cdc-wdm1] Operating mode retrieved:
    Mode: 'online'
    HW restricted: 'no'

Support for this is already available automatically when using libqmi and ModemManager git master. It will likely hit the next stable releases as well.

MBIM?

Well, I don’t know if there is any command in MBIM to do the same operation (likely there is in a Sierra-specific service), but one thing we could anyway try to do is to use “QMI embedded in MBIM“, which Bjørn has already tested some times. I’ll try to test that some day, but I’ll need to get another modem as my DW5570 only comes up with a QMI configuration. For now, if you’re stuck with this problem using MBIM, you can likely just select USB configuration #1 using usb_modeswitch and get the modem switched to QMI mode.

TL;DR?

Dell-branded Sierra Wireless modems need the “FCC Auth” command (QMI DMS service, 0x555F) before they can be brought online; this is supported in libqmi and ModemManager already.

[UPDATE]
These fixes have been already released in ModemManager 1.4.4 and libqmi 1.12.4.


February 06, 2015

I just pushed a patchset into libinput to introduce the concept of device groups. This post will explain what they are in this context and why they are needed.

libinput exposes kernel devices as an opaque struct libinput_device. It only recognises evdev devices at this point; this may change in the future if we see a need for it. libinput also exposes a few bits of information about the device, such as the name, PID/VID and a handle to the struct udev_device that matches this device. The latter enables callers to get more information from the device. libinput also provides a bunch of configuration settings for each device: pointer devices get acceleration settings, absolute devices have calibration, etc. For most devices this works just fine.

Some devices, like Wacom tablets, are represented as multiple event nodes. On a 3.19 kernel you'd get three event nodes for an Intuos 5 touch - the pad (i.e. the tablet itself), a touch node, and one node for all the tools (stylus, eraser, etc., multiplexed). libinput exposes each of these nodes as a separate device, but that is problematic when applying certain configuration settings. For example, applying a left-handed configuration to the tablet means it's rotated by 180 degrees, so we need to rotate the coordinates accordingly. Of course, such a rotation would have to apply to both the touch and the stylus devices, but now the caller is left with having to figure out which other devices to set.

The original idea was to present such devices as a single, merged struct libinput_device with multiple capabilities, i.e. a single physical device that can do touch, tablet and pad buttons. A configuration setting like left-handed-ness would then apply to all devices transparently. The API is clean, usage is simple, everybody is happy. Except when they aren't - this doesn't actually work particularly well. First, having such merged devices means we require devices to change at runtime, adding/removing capabilities on-the-fly which puts a burden on the callers to handle this correctly. Second, not all configuration options apply to all subdevices. If the Intuos is used as a touchpad you may want natural scrolling enabled on the touchpad but the wheel on the Wacom mouse should probably still work normally. Third, the subdevices may have different PID/VIDs and certainly have different udev devices. So now libinput needs a way to get to those. In short, a merged device looks nice in theory but the implementation of it would make the libinput API cumbersome to use for little benefit.

The solution to this are device groups: each device in libinput is now part of a struct libinput_device_group. This is just an opaque object that doesn't do anything but sit there but it's enough to identify how devices are grouped together. If two devices return the same device group, they logically belong together. The caller can then decide what to do with it, e.g. loop through all devices of a group to apply a certain configuration setting to all devices. The basic approach is thus:


new_device = libinput_event_get_device(event);
new_group = libinput_device_get_device_group(new_device);
libinput_device_group_ref(new_group);

for each (device, group) in previously_stored_devices {
    if (group == new_group)
        printf("This device shares a group with %s", device);
}
A device group's lifetime is as you'd expect: it is created for the first device in the group and ceases once the last device in the group is removed. It's not deleted until the last reference is released, but it won't get recycled. In other words, if you keep unplugging and re-plugging that Intuos tablet, the device group will be new after every plug.

Note that we're intentionally not providing ways to get the devices from a device group, or to count the devices within a group, etc. This avoids race conditions (the view libinput has of the devices isn't the same as the one the caller has while going through the event queue) but it also makes the API simpler. libinput's callers are mainly compositors which use toolkits with advanced datastructures (glib, Qt, etc.). Using a pointer as key into a hashmap is simpler and less buggy than using whatever hand-crafted hashmap/list implementation we could provide through the libinput API.
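
Putting it together, a caller applying a left-handed configuration to a whole tablet could loop over its stored devices. A minimal sketch, assuming the caller keeps devices[] itself and that left-handed mode is exposed through the config API:

struct libinput_device_group *group =
        libinput_device_get_device_group(tablet_device);

for (i = 0; i < ndevices; i++) {
        struct libinput_device *dev = devices[i];

        if (libinput_device_get_device_group(dev) != group)
                continue;

        /* skip subdevices without a left-handed mode */
        if (libinput_device_config_left_handed_is_available(dev))
                libinput_device_config_left_handed_set(dev, 1);
}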

January 28, 2015
Another kernel release is imminent and a lot of things have happened since my last big blog post about atomic modeset. Read on for what new shiny things 3.20 will bring to this area.

Atomic IOCTL and Properties


The big thing for sure is that the actual atomic IOCTL from Ville has finally landed for 3.20. That together with all the work from Rob Clark to add all the new atomic properties to make it functional (there are no IOCTL structs for any standard values, everything is a property now) means userspace can finally start using atomic. Well, it's still hidden behind the experimental module option drm.atomic=1 but otherwise it's all there. There are a few big differences compared to earlier iterations:
  • Core properties are now fully handled by the core, drivers only decode their own properties. This means that the handling of standardized properties should be more uniform and also makes it easier to extend (e.g. with the standardized rotation property we already have) beyond the parameters the legacy interfaces provided.
  • Atomic props&ioctl are opt-in per file_priv, userspace needs to explicitly ask for it (see the sketch after this list). This is the same idea as with universal plane support and will make sure that old userspace doesn't get confused and fall over when it would see all the new atomic properties. In the same spirit some of the legacy-only properties (like DPMS) will be rejected in the atomic IOCTL paths.
  • Atomic modesets are currently not possible since the exact ABI for how to handle the mode property is still under discussion. The big missing thing is handling blob properties, which will also be needed to upload gamma table updates through the atomic interface.
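
To give a rough idea of what this looks like from userspace, here's a sketch using the drmModeAtomic* wrappers libdrm grew for this interface (they may have landed after this post; the fb_id_prop/crtc_id_prop variables are placeholders, property IDs must first be resolved with drmModeObjectGetProperties()):

/* opt in to the new interface (per file descriptor) */
drmSetClientCap(fd, DRM_CLIENT_CAP_ATOMIC, 1);

drmModeAtomicReq *req = drmModeAtomicAlloc();

/* everything is a property now, including the framebuffer and the
 * CRTC a plane is bound to */
drmModeAtomicAddProperty(req, plane_id, fb_id_prop, fb_id);
drmModeAtomicAddProperty(req, plane_id, crtc_id_prop, crtc_id);

/* check the configuration first, then commit it for real */
int ret = drmModeAtomicCommit(fd, req, DRM_MODE_ATOMIC_TEST_ONLY, NULL);
if (ret == 0)
        ret = drmModeAtomicCommit(fd, req, 0, NULL);

drmModeAtomicFree(req);
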
Another missing piece that's now resolved is the DPMS support. On a high level this reduces the complexity of the legacy DPMS settings to a simple per-CRTC boolean. Contemporary userspace really wants nothing more, and a simple on/off is all that current hardware can support anyway. Furthermore all the bookkeeping is done in the helpers, which call down into drivers to enable or disable entire display pipelines as needed. Which means that drivers can rip out tons of code for the legacy DPMS support by just wiring up drm_atomic_helper_connector_dpms.
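
The wiring itself is a one-liner in the connector funcs; a sketch for a hypothetical driver (the foo_ names are placeholders):

static const struct drm_connector_funcs foo_connector_funcs = {
        /* legacy DPMS requests get translated into atomic updates */
        .dpms = drm_atomic_helper_connector_dpms,
        /* ... the driver's remaining callbacks ... */
};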

New Driver Conversions and Backend Hooks


Another big thing for 3.20 is that driver support has moved forward greatly: Tegra now has most of the basic bits ready, and MSM is already converted. Both still lack conversion and testing for DPMS since that landed very late though. There's a lot of prep work for exynos, but unfortunately nothing yet towards atomic support. And i915 is in the process of being converted to the new structures and will have very preliminary atomic plane update support in 3.20, hidden behind a module option for now.

And all that work resulted in piles of little fixes. Like making sure that the legacy cursor IOCTLs don't suddenly stall a lot more, breaking old userspace. But we also added a bunch more hooks for driver backends to simplify driver code:
  • For drivers there's often a very big difference between a plane update and disabling a plane. And with atomic state objects it's a bit more work to figure out when exactly you're doing an on-to-off transition. Thierry Reding added a new optional ->atomic_plane_disable() hook for drivers which will take care of all those distinctions.
  • Mostly for hysterical raisins going back to the original Xrandr implementations for userspace mode setting, the callbacks to enable and disable encoders and CRTCs used by the various helper libraries have really confusing names, and with the legacy helpers all kinds of strange semantics. Since the atomic helpers massively simplify things in this area it made a lot of sense to add a new set of ->enable() and ->disable() hooks, which are preferred if they're set. All the other hooks (namely ->prepare(), ->commit() and ->dpms()) will eventually be deprecated for atomic drivers. Note that ->mode_set is already superseded by ->mode_set_nofb due to the explicit primary plane handling with atomic updates.
Finally driver conversions showed that vblank handling requirements imposed by the atomic helpers are a lot stricter than what userspace tends to cope with. i915 has recently reworked all its vblank handling and improved the core helpers with the addition of drm_crtc_vblank_off() and drm_crtc_vblank_on(). If these are used in the CRTC disable and enable hooks the vblank code will automatically reject vblank waits when the pipe is off. Which is the behaviour both the atomic helpers and the transitional helpers expect.

One complication is that on load the driver must manually ensure that the vblank state matches up with the hardware and atomic software state, by calling these functions. In the simple case where drivers reset everything to off (which is what the reset implementations provided by the atomic helpers presume) this just means calling drm_crtc_vblank_off() somewhere in the driver load code. Drivers that read out the actual hardware state need to call either _off() or _on() to match the actual display pipe status.
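
A sketch of what this looks like in a hypothetical driver's CRTC hooks (the foo_ names are placeholders):

static void foo_crtc_enable(struct drm_crtc *crtc)
{
        foo_hw_enable_pipe(crtc);   /* placeholder for hardware setup */
        drm_crtc_vblank_on(crtc);   /* vblank waits are valid from here on */
}

static void foo_crtc_disable(struct drm_crtc *crtc)
{
        drm_crtc_vblank_off(crtc);  /* further vblank waits get rejected */
        foo_hw_disable_pipe(crtc);  /* placeholder for hardware teardown */
}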

Future Work


Of course there's still a few things left to do before atomic display updates can be rolled out to the masses. And a few things that would be rather nice to have, too:
  • Support for blob properties so that modesets and a bunch of other neat things can finally be done.
  • Testing, and lots of it, before we enable atomic by default.
  • Thanks to DP MST connectors can be hotplugged, and there are still a lot of lifetime issues surrounding connector handling all over the drm subsystem. Atomic display code is unfortunately no exception.
  • And finally try to make the vblank code a bit more generally useful and then implement generic async atomic commit on top of that.
It promises to stay interesting!
January 27, 2015

Yesterday I released version 0.8 of AppStream, the cross-distribution standard for software metadata, which is currently used by GNOME-Software, Muon and Apper to display rich metadata about applications and other software components.

 What’s new?

The new release contains some tweaks to the AppStream documentation, and extends the specification with a few more tags and refinements. For example, we now recommend sizes for screenshots. The recommended sizes are the ones GNOME-Software already uses today, and it is a good idea to ship those to make software centers look great, as other software centers are planning to use them as well. Normal sizes as well as sizes for HiDPI displays are defined. This change affects only the distribution-generated data; the upstream metadata is unaffected by it (the distro-specific metadata generator will resize the screenshots anyway).

Another addition to the spec is the introduction of an optional <source_pkgname/> tag, which holds the source package name the packages defined in <pkgname/> tags are built from. This is mainly for internal use by the distributor, e.g. it can decide to use this information to link to internal resources (like bugtrackers, package-watch etc.). It may also be used by software-center applications as additional information to group software components.

Furthermore, we introduced a <bundle/> tag for future use with 3rd-party application installation solutions. The tag notifies a software installer about the presence of a 3rd-party application bundle, and provides the necessary information on how to install it. In order to do that, the software center needs to support the respective installation solution. Currently, the Limba project and Xdg-App bundles are supported. For software managers, it is a good idea to implement support for 3rd-party app installers as soon as the solutions are ready. Both projects are currently under heavy development. The new tag is already used by Limba, which is why Limba depends on the latest AppStream release.
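
To illustrate, a distribution-generated component using both new tags might look roughly like this (the component and bundle names here are made up; consult the specification for the authoritative form):

<component type="desktop">
  <id>foobar.desktop</id>
  <pkgname>foobar</pkgname>
  <source_pkgname>foobar-src</source_pkgname>
  <bundle type="limba">foobar-1.0.2</bundle>
</component>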

How do I get it?

All AppStream libraries, libappstream, libappstream-qt and libappstream-glib, support the 0.8 specification in their latest versions – so if you are using one of these, you don’t need to do anything. For Debian, the DEP-11 spec is currently being updated, and the changes will land in the DEP-11 tools soon.

Improve your metadata!

This call goes especially to many KDE projects! Getting good data is partly a task for the distributor, since packaging issues can result in incorrect or broken data, screenshots need to be properly resized, etc. However, flawed upstream data can also prevent software from being shown, since software with broken or missing data will not be incorporated into the distro XML AppStream data file.

Richard Hughes of Fedora has created a nice overview of software failing to be included. You can see the failed-list here – the data can be filtered by desktop environment etc. For KDE projects, a Comment= field is often missing in their .desktop files (or a <summary/> tag needs to be added to their AppStream upstream XML file). Keep in mind that you are not only helping Fedora by fixing these issues, but also all other distributions consuming the metadata you ship upstream.

For Debian, we will have a similar overview soon, since it is also a very helpful tool to find packaging issues.

If you want to get more information on how to improve your upstream metadata, and what new metadata should look like, take a look at the quickstart guide in the AppStream documentation.

Beware of promo-newa.com, they scammed my colleague and friend Richard. As someone with experience in importing from China I know how scary and risky it can be, so I completely sympathize with him. Apparently they sent hacked 96MB flash drives that reported themselves as 1GB flash drives.

Let's make sure the internet is filled with references to this scam. Also, if you live in Shenzhen and/or can think of any way of helping him, that'd be really nice.

At the very least, make sure you share this post around in your preferred social media wall!

Yesterday I arrived in Cambridge to attend the DevX hackfest. Loads of good stuff going on; I am mostly focusing on trying to automatically integrate the hundreds of ignored pull requests we're getting in GitHub's mirror with Bugzilla. In the meantime there are loads of interesting discussions about sandboxing, Builder, docs, and mallard balls being thrown all over the place and hitting my face (thanks Kat).

It is really nice to catch up with everyone. We went for dinner to a pretty good Korean place; I should thank Codethink for kindly sponsoring the dinner. Afterwards we went to The Eagle pub, apparently the place where the discovery of DNA was celebrated and discussed.

And this morning we are celebrating Christian Hergert making it to the 50K stretch goal just before the end of the crowdfunding campaign for GNOME Builder.

I would like to thank my employer, Red Hat, for sponsoring my trip here too.

January 26, 2015

Jobs at Red Hat
So I got a LOT of responses to my blog post about the open positions we have here at Red Hat working on Fedora and the Desktop. In fact I got so many it will probably take a bit of time before we can work through them all, so you might have to wait a little bit before getting a response from us. Anyway, thank you to everyone who sent me their CV, much appreciated, and I am looking forward to working with those of you we end up hiring!

Builder campaign closes in 13 hours
I want to make one last pitch for everyone to contribute to the Builder crowdfunding campaign. It has just passed 47 000 USD as I write this, which means we just need another 3000 USD to reach the graphical debugger stretch goal. Don’t miss out on this opportunity to help this exciting open source project!

January 24, 2015

Next week I am traveling to Brussels to attend FOSDEM, but this time as a speaker! My talk will explain why we need to test our OpenGL drivers with OpenGL conformance test suites, list the unofficial Free Software ones and give a brief introduction on how to use them.

See you there!

FOSDEM 2015

Update: The slides are available online.

January 22, 2015

It seems we can't ever get rid of the issues with this series. Daniel Martin filed a kernel bug for the latest series of these devices (Oct 2014) and it looks like they all need manual fixing again.

When the *40 series first came out, the PS/2 firmware was buggy and advertised bogus coordinate ranges for the x/y axes. Since we use those coordinate ranges to set up the size and position of software buttons (very much needed since that series did away with the physical trackpoint buttons) we added kernel patches for each of those laptops. The kernel would match on the PNPID (e.g. LEN0036 on a T440) and fix the min/max range for both axes based on actual measurements. Since this requires someone to have a laptop, measure it, file a bug or send a patch, wait for it to get into the kernel, and wait for it to get into distros, it took quite a while to get all models supported.
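
The quirks boil down to a lookup table in the kernel's synaptics driver; a simplified sketch of the idea (the struct layout and values here are illustrative, not copied from drivers/input/mouse/synaptics.c):

struct min_max_quirk {
        const char * const *pnp_ids;      /* e.g. "LEN0036" for the T440 */
        int x_min, x_max, y_min, y_max;   /* measured, not firmware-reported */
};

static const struct min_max_quirk min_max_pnpid_table[] = {
        { (const char * const []){ "LEN0036", NULL },
          1024, 5112, 2024, 4832 },
        { }
};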

Lenovo updated the series in Oct 2014 and it's starting to get into the hands of users. And as it turns out, the touchpads now have different coordinate ranges but the same PNPID. And the values reported by the firmware are still bogus, so we need the same quirk, again, for each touchpad. Update 22/01/15: looks like the ranges are correct enough, so we don't need to update all ranges separately.

So in short: if you have one of the latest series *40 touchpads, your touchpad software buttons will be off. CC yourself on the kernel bug and if you have a model that's not listed there yet, add the required data. Eventually that will end up in the kernel and then everything is hunky-dory again. Until then, have a drink on behalf of the Synaptics/Lenovo QA departments.

Now the obvious question: why does this work with Windows? They don't use the PS/2 protocol but the SMBus/RMI4 interface and thus PS/2 firmware correctness is apparently not top priority for the QA departments. But the SMBus protocol requires the Host Notify feature, which caused Synaptics to reimplement the i2c driver for Windows. And that's what is shipped/preinstalled as driver. We don't support Host Notify on Linux, so there goes that idea. But there's strong suspicion that's not the only piece of the puzzle that's missing anyway...

Update 22/01/15: The min/max ranges advertised seem to be correct in the newer versions which would indicate that Synaptics has fixed the firmware. That's great (except for re-using the PNPID). Now we need to just detect that and drop the quirks for the newer touchpads. Hans has a good suggestion for how to do this, so with a bit of luck this will end up being only one kernel patch instead of one per device.

January 21, 2015

So Red Hat are currently looking to hire into the various teams building and supporting efforts such as the Fedora Workstation, the Red Hat Enterprise Linux Workstation and of course Fedora and RHEL in general. We are looking at hiring around 6-7 people to help move these and other Red Hat efforts forward. We are open to candidates from any country where Red Hat has a presence, but for a subset of the positions, relocating to our main engineering offices in Brno, Czech Republic or Westford, Massachusetts, USA will be a requirement, or candidates interested in relocating will be given preference. We are looking for a mix of experienced and junior candidates, so regardless of whether you are fresh out of school or have been around for a while, this might be for you.

So instead of providing a list of job descriptions, what I want to do here is list the various skills and backgrounds we want, and then we will adjust the exact role of the candidates we end up hiring depending on the mix of skills we find. So this might be for you if you are a software developer and have one or more of these skills or backgrounds:

* Able to program in C
* Able to program in Ruby
* Able to program in Javascript
* Able to program in Assembly
* Able to program in Python
* Experience with Linux Kernel development
* Experience with GTK+
* Experience with Wayland
* Experience with x.org
* Experience with developing for PPC architecture
* Experience with compiler optimisations
* Experience with llvm-pipe
* Experience with SPICE
* Experience with developing software like Virtualbox, VNC, RDP or similar
* Experience with building web services
* Experience with OpenGL development
* Experience with release engineering
* Experience with Project Atomic
* Experience with graphics driver enablement
* Experience with other PC hardware enablement
* Experience with enterprise software management tools like Satellite or ManageIQ
* Experience with accessibility software
* Experience with RPM packaging
* Experience with Fedora
* Experience with Red Hat Enterprise Linux
* Experience with GNOME

It should be clear from the list above that we are not just looking for people with a background in desktop development this time around; two of the positions for instance will mostly be dealing with Linux kernel development. We are looking for people here who can help us make sure the latest and greatest laptops on the market work great with Fedora and RHEL, be that on the graphics side or in terms of general hardware enablement. These jobs will put you right in the middle of the action in terms of defining the future of the 3 Fedora variants, especially the Workstation; defining the future of Red Hat's Enterprise Linux Workstation products; and pushing the Linux platform in general forward.

If you are interested in talking with us about whether we can find a place for you at Red Hat as part of this hiring round, please send me your CV and some words about yourself and I will make sure to put you in contact with our recruiters. And if you are unsure about what kind of things we work on here, I suggest taking a look at my latest blog post about our Fedora Workstation 22 efforts to see a small sample of some of the things we are currently working on.

You can reach me at cschalle(at)redhat(dot)com.

January 19, 2015

A Fedora 22 feature is to use the libinput X.Org driver as the default driver for all non-tablet devices. This replaces the current defaults of synaptics for touchpads and evdev for anything else (tablets usually use the wacom driver, if installed). As expected, changing a default has some repercussions both for users and for developers. These are outlined below, based on the libinput 0.8 release. Future versions may add features, so check with your latest local version. Generally, the behaviour should roughly stay the same; big changes such as devices not being detected or not working are most likely bugs. Some behaviours are new, e.g. always-on palm detection, top software buttons on specific touchpads, etc. If in doubt, check the libinput documentation for hints on whether something is supposed to work in a particular manner.

Changes visible to Users

Any custom xorg.conf snippets may cease to work, even if they are properly stacked. Options set by snippets are almost always exclusive to one particular driver. When the default driver changes, the snippet may not apply to the device anymore. Whether a snippet stops working depends on whether the Driver line is present. Consider this example snippet:

Section "InputClass"
Identifier "enable tapping"
MatchProduct "my touchpad"
Driver "synaptics"
Option "TapButton1" "1"
EndSection
This snippet does two things: it assigns the synaptics driver to the "my touchpad" device and sets the option TapButton1. The assignment will override the default libinput assignment, i.e. this device won't change behaviour; you just don't get to use any new features. If the Driver line is not present then this snippet won't do anything, since the libinput driver does not provide a TapButton1 option. It is safe to leave the snippet in place; options that are not supported by a driver are simply ignored.

The xf86-input-libinput man page has a list of options that can be set. For example, the above snippet would have this equivalent:


Section "InputClass"
Identifier "enable tapping"
MatchDriver "libinput"
MatchProduct "my touchpad"
Option "Tapping" "on"
EndSection
Note that this matches on the driver rather than assigning a driver. Since options are driver-specific this is the correct approach.

The other visible change is a difference in default pointer speed. Fine-tuning pointer acceleration is on our TODO list, but we're not quite there yet and any help would be appreciated. In the meantime you may see changes in pointer acceleration.

Finally, you may see that certain features have gone the way of the dodo. Specifically, the touchpad code exposes a lot fewer knobs to tweak. Most of that is intentional, some of it may warrant discussion. If there is a particular feature that you are missing or that doesn't work the way you want, please file a bug.

Changes visible to developers

These changes affect desktop environments, specifically the part that configures input devices. The changes fall into four categories: pointer acceleration, button mapping, touchpad disabling and device properties. The property "libinput Send Events Modes Available" exists on all devices; it can be used to determine whether a device is handled by the libinput driver.

Pointer acceleration

The X server exposes a variety of knobs for its pointer acceleration code. The oldest knob (and specified in the core protocol) is the XChangePointerControl request. In some environments this is exposed as a single slider, in others it's split into multiple settings (Acceleration and Threshold, for example).

libinput does away with this and only exposes a single one-value float property "libinput Accel Speed" with a range of -1 (slowest) to 1 (fastest). The XChangePointerControl request has no effect on a libinput device. It is up to you how to map the current speed settings into the [-1, 1] range.
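
A sketch of what setting this looks like through the XInput2 property API (the same pattern works for the 8-bit boolean properties in the next sections, using format 8 and a single value):

#include <X11/Xatom.h>
#include <X11/extensions/XInput2.h>

static void
set_accel_speed(Display *dpy, int deviceid, float speed /* [-1.0, 1.0] */)
{
        Atom prop = XInternAtom(dpy, "libinput Accel Speed", False);
        Atom float_type = XInternAtom(dpy, "FLOAT", False);

        /* float properties are stored with a 32-bit format */
        XIChangeProperty(dpy, deviceid, prop, float_type, 32,
                         XIPropModeReplace, (unsigned char *)&speed, 1);
}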

Button mapping

The X server provides button mapping through the XSetPointerMapping request. This is most commonly used to apply a left-handed configuration and to enable natural scrolling. The call will continue to work with the libinput driver, but better methods are available.

The property "libinput Left Handed Enabled" takes a single boolean 8-bit value to enable and disable left-handed mode. Unlike the X request this will automatically take care of the tapping configuration (and other things in the future). If the property is not available on a device, that device has no left-handed mode.

The property "libinput Natural Scrolling Enabled" takes a single boolean 8-bit value to enable and disable natural scrolling. This applies to smooth scrolling and legacy button scrolling (which the libinput driver doesn't do anyway). If the property is not available on a device, that device has no natural scrolling mode.

Touchpad disabling

In the synaptics driver, disabling the touchpad is usually done with the "Synaptics Off" property. This is used by syndaemon to turn the touchpad off while typing. libinput does this by default, so it is safe to simply ignore this property. Don't bother starting syndaemon, it won't control the libinput driver.

Device properties

Any code that handles a driver-specific property (prefixed by "evdev" or "synaptics") will stop working. These properties are not exposed by the libinput driver (we tried, it was not viable). KDE's kcm_touchpad module is a particularly bad offender here, it exposes almost every toggle the driver ever had. Make sure the code behaves well if the properties are not present and you're basically good to go.

If you decide to handle libinput-specific properties, the general rule is: if a single-value property is not present, that configuration does not apply to this device. Bitmask-style properties are split into a "libinput foo Available" and a "libinput foo Enabled" property. The former lists the available settings, the latter enables a specific setting. Have a look at the xf86-input-libinput source for details on each property.

So Fedora Workstation 21 is done and out and I am extremely pleased to see the positive reception and great reviews. But we are not resting on our laurels here and are already busy planning for the Fedora Workstation 22 release. As many of you might know, Fedora Workstation 22 is going to come up relatively fast, so we only have about 6 more weeks of development on it before the feature freezes start to kick in. Luckily we have a relatively long list of items that we started working on during the Fedora Workstation 21 cycle that are nearing completion and thus should make the next release. We are of course also working on bigger long term developments of which you should maybe see the first outline in Fedora 22, but not the final version. I thought it would be nice to summarize some of the bigger items we expect to land and link to the relevant blogs and announcements for each one.

Wayland
So first out is to give an update on our work on Wayland as I know that is something a lot of people are curious about. We are continuing to make great strides forward and recently hired Jonas Ådahl to the team, who many might recognize as an active Wayland and libinput developer. He will be spearheading our overall Wayland effort as we are approaching the finish line. All in all things are looking good; we got a lot of the basic plumbing in place for Fedora Workstation 21, so most work these days is focused on polish and cleanups. One of the bigger items is the migration to libinput. libinput is a library we decided to create to be able to share input device handling between X and Wayland and thus make the transition smoother and lower our workload during the transition period. libinput itself is getting very close to feature complete, and they are even working on some new features for it now, taking it beyond what was in X. Peter Hutterer recently released version 0.8 and we expect to have 1.0 out and in use for both X.org and Wayland in time for Fedora Workstation 22.
In parallel we are also working on porting the needed bits in GNOME over to use libinput and remove any lingering X dependencies, like the GNOME Control Center which should also be ready for Fedora Workstation 22.

Another major change related to Wayland in Fedora Workstation 22 is that we will switch the default backends in GTK+ and SDL over to using Wayland. Currently in Fedora Workstation 21 applications are actually running on top of XWayland, but in Fedora Workstation 22 at least GTK+ and SDL applications will default to Wayland when run under the Wayland session.
The Wayland SDL backend has been around for quite a bit, but Jonas Ådahl plans on spending some time smoothing out the last rough edges. In fact, for SDL applications we hope we can actually provide a noticeable performance improvement over X in some cases (not because OpenGL will be faster of course, but because we might be able to be smarter about handling different resolutions between desktop and game), but we have to wait and see if that pans out or if we have to settle for performance parity with X. We are also looking at getting the login session to use Wayland by default. All in all this should take us a huge step forward towards making the use of Wayland feel real.

As it looks now, Wayland should be quite close to what you would define as feature complete for Fedora Workstation 22, but one thing that is going to take longer to reach maturity is the support for binary drivers, especially the NVidia ones. This of course is a task that mostly falls on NVidia for natural reasons, but we are trying to help out, with Adam Jackson working to make sure Mesa works with their proposed EGLStreams and OpenGL Dispatcher solutions. So during the course of the coming year we will likely have a situation where you will be able to have a production ready Wayland session if you are running any of the open source drivers, but if you want to run Wayland on top of the NVidia binary driver that is most likely to only really be possible towards the end of the year. That said, this is a guesstimate from our side as to how quickly the heavy lifting will happen; how quickly it will be released by NVidia for public consumption is of course all relying on internal plans and resources at NVidia and not something we control.

Battery life
One thing we know, being developers ourselves and from speaking with developers about their operating system of choice, is that battery life is among the top 5 reasons for the choices people make about their hardware and software. Due to this, Owen Taylor has been investigating for some time now what solutions exist today, what other operating systems are doing and what approaches we can take to improve battery life. A common complaint we hear from a lot of people is that they don’t feel they get great battery life when running Linux on their laptops currently. Some people are able to solve this using powertop, but we feel there is a lot of room for automatically giving our users better battery life beyond manual tweaking with powertop.

Improving battery life is a complex issue in many ways, including figuring out how to measure battery life. I guess everyone has seen laptops advertised with X number of hours of battery life, but it is our impression that those numbers tend to be quite bogus even when running the bundled operating system. In some testing we have done, we concluded that the worst offenders' numbers could only be true if you left your laptop idle in the corner with the screen blacked out. One outcome of this work is the gnome-battery-bench tool, which will help us achieve a couple of things: it should generate comparable battery lifetime numbers, which should help our users choose the hardware that gives the best battery life under Linux, and it also lets us as developers keep tracking how changes affect battery life so that we can catch regressions, for instance. It also lets us verify the effect various kernel tuneables or ambient light detection schemes have on battery life in a better way than we can with existing tools. We also hope to use this to work with vendors to improve the battery life of their hardware when running Fedora or RHEL. Anyway, I suggest reading Owen Taylor's blog for some more details of his work on improving battery life.

One important effort we want to undertake here, which might not all make it for Fedora Workstation 22, is taking advantage of the ambient light detectors in many modern laptops. One of the biggest battery drains in your system is the screen brightness setting, and by using the ambient light detection hardware we hope to be able to put in place some intelligent behavior for different situations. This is a hard problem though, and it was attempted in GNOME before, but the end result back then was that people felt they were "fighting" GNOME over their laptop brightness settings, so in the end it was dropped completely. We need to be careful not to repeat that outcome.

Application bundles
Another major effort that is not going to be ready for Fedora Workstation 22, but of which we might have some preview, is application bundles. Matthias Clasen recently sent out an email to the Fedora Desktop mailing list outlining the state of the application bundles work. This is a continuation of the Sandboxed Applications in GNOME proposal from Lennart Poettering. The effort is being spearheaded by Alexander Larsson and the goal is to build the infrastructure needed to do sandboxed desktop applications efficiently. There is a wiki page up already detailing Sandboxed Apps and there are some test applications already available. For instance you can grab an application bundle of Builder, the cool new IDE project from Christian Hergert. (Hint: make sure to support his Builder crowdfunding effort if you have not already.) Once this effort matures it will revolutionize how desktop applications are built and distributed. It should make life easier for application developers as these bundled applications are designed to be distribution agnostic, and the sandboxing aspect should help improve security. Also, the transition should put the application developers directly in charge of the update cycle of their applications, enabling them to better support their users.

3rd Party Application handling
So the ever resourceful Richard Hughes has been working on adding support for handling 3rd party applications in GNOME Software. He outlined this effort in a recent blog post about GNOME Software.

While the end goal here is to offer 3rd party application bundles as described in the section above, the feature also has a lot of near-term advantages. We have seen that over the course of the last years we moved from a model where you use your browser to search for software online to users expecting to find all software available for a system through its app store. With this 3rd party application support available in GNOME Software we can start working to make that expectation a reality in Fedora as well. We took great strides forward in Fedora Workstation 21 with having metadata available for most of the standard applications packaged in Fedora, but there are also a lot of popular applications and other things out there that people tend to install and use which we, for various reasons, are not interested in or able to ship in Fedora. The reasons for this can range from licensing issues to packaging issues to simply resource issues. With Richard's work we will be able to make such software discoverable in Fedora, yet make a clear distinction between the software we have vetted and checked and the software you get from 3rd parties.

How to deal with 3rd party software has been a long and ongoing discussion in the Fedora community, and there are a lot of practical and principal details to deal with, but hopefully with this infrastructure in place it will be a lot easier to navigate those issues as people have something concrete to relate to instead of just abstract ideas and concepts.

One challenge we have to figure out, for instance, is that on one side we don’t want 3rd party software offered in Fedora to be some form of endorsement or a sign of being somehow vetted by the Fedora Project on an ongoing basis, yet on the other side the list will most likely need to be based on some form of editorial process, for instance to protect both Red Hat and Fedora from potential legal threats. I plan on sending an initial proposal to the Workstation Working Group soon for how this can work, and once we have hashed out the details there we will need to bring the Working Group's proposal into the wider Fedora community, as this also affects our Cloud and Server offerings.

File Manager
A lot of people these days use Google Drive, be that personally or because their company has a corporate Google Apps account. So to make life easier for our users we are making sure that Nautilus is able to treat your Google Drive as just another drive on the system, letting you drag and drop files off or onto it. We also dedicated some effort to cleaning up and modernizing the file manager in general, with Carlos Soriano blogging about his efforts there just before Christmas. All in all I think these are improvements that should improve the life of our developer and sysadmin target audience, but of course they are also very useful improvements for the general Linux-using public.

Qt Theming
One of the things we had to postpone for Fedora Workstation 21 was the Adwaita theme for Qt applications. We are expecting it to hit Fedora Workstation 22 though, and you can get the theme to install and test from Martin Briza's Copr repository. The end goal here is that whether you run a pure Qt application like Skype or Scribus, or a KDE application like Krita or Amarok, you get an Adwaita look and feel to the application. Of course desktop integration isn’t just about a theme, there is a reason the GNOME HIG exists, but this should be an improvement over the current situation. The theme currently targets Qt4, but of course Qt5 is also on the roadmap for a later release.

Further terminal improvements
As I mentioned in an earlier blog entry about Fedora Workstation, we realize that the terminal is the most important application for many developers and sysadmins. So we are also hoping to land some more of the terminal improvements we have been working on since Fedora Workstation 21. The notifications for long-running jobs are maybe the thing I know a lot of developers are excited about getting their hands on. They will let you, for instance, start a long compile in a terminal and know that you will be notified once it is completed, instead of having to manually check in from time to time.

More development tools
In my opinion the best IDE for Python development currently is PyCharm. And not only is it the best from a functionality standpoint, they also decided to release an open source version last year. That said, we have been struggling a bit with the packaging of PyCharm, and interestingly enough it is one of those applications I think will benefit greatly from the application bundle work we are doing, but in the meantime we at least do have a Copr of PyCharm available. It is still an open question, but we might make this Copr one of our test cases for the 3rd party application support in GNOME Software as mentioned earlier. Anyway, if you are a Python developer I strongly recommend taking a look at it. Personally I have looked at various Python IDEs over the years, but always ended up just going back to Gedit; when trying PyCharm it was the first time I felt that the application actually offered me useful functionality beyond what a text editor like Gedit does. Recent versions also deal well with the introspection-based Python bindings for GTK3, which was a great boon for me.

We are also looking at improving the development story around Vagrant and doing Fedora and RHEL development, more details on that at a later point.

ABRT improvements
The ABRT tool has become a crucial development tool for us over the last couple of years. The Fedora Retrace server is one of our main tools for prioritizing which bugs to look at first and a crucial part of our goal of making Fedora a solid distribution. That said, especially in its early days, ABRT has had its share of detractors and people being a bit frustrated with it, so Bastien Nocera and Allan Day have been working with the ABRT team to integrate it further with the desktop, for instance ensuring that it follows your desktop-wide privacy settings, and to make sure that the user experience of submitting a retrace report is as smooth, intuitive and unobtrusive as possible; for instance, you don’t want ABRT to choke your system while trying to generate a stack trace for us. The Fedora Workstation Tasklist contains links to bugzilla and github so you can track their progress.

Still a lot to do!
So making our vision for the Fedora Workstation come true takes of course a lot of effort from a lot of people. And we are really lucky to be part of such a great community where so much cool stuff is happening all the time. The Builder effort from Christian Hergert that I talked about earlier is one of them, but there are so many other things happening too. So if you want to get involved take a look at our tasklist and see if there is anything that interests you, or for that matter if there is something that you think should be worked on but isn’t on the list yet. Then come join us either in #fedora-workstation on the freenode IRC network or on the fedora-desktop mailing list.

So I'm stuck somewhere at an airport, jetlagged, on my return trip from my very first LCA in Auckland - an awesome conference, except for being on the wrong side of the globe: a very broad range of speakers, awesome people all around, great organization, and since it's still a community conference, none of the marketing nonsense and sales-pitch keynotes.

I also did a presentation about botching up ioctls. Compared to my blog post it has a bunch more details on technical issues and some overall comments on what's really important to avoid a v2 ioctl because v1 ended up being unsalvageable. Slides and video (courtesy of the LCA video team).
January 15, 2015

libinput 0.8 was released yesterday. One feature I'd like to talk about here: the change to provide mouse wheel events as physical distances.

Mouse wheel movement comes in clicks. In the evdev protocol they're sent via the REL_WHEEL and REL_HWHEEL axes, with a value of 1 per click. Spinning the wheel fast enough will give you a higher value per event, but it's still just a multiple of the physical clicks. This is a conundrum for libinput.

libinput exports scroll events as "axis" events; the value returned by libinput_event_pointer_get_axis_value() for touchpads and button scrolling is in "pixels". libinput doesn't have a concept of pixels of course, but the compositor will likely take the relative motion and apply it to the cursor. Scroll events are in the same coordinate space, i.e. scrolling with two-finger scrolling has the same feel as moving the pointer. This continuous coordinate space is at odds with the discrete values coming from a wheel. We added axis sources to the API so you can now tell whether an event was generated by the wheel or some other scroll method. But still, the discrete values from a wheel are at odds with the other sources.

For libinput 0.8, we changed the default reporting mode of the wheel. For the click count, a new call libinput_event_pointer_get_axis_value_discrete() provides that number (positive or negative, depending on direction). libinput_event_pointer_get_axis_value() on a wheel now returns the movement of the wheel in degrees. That gives us a continuous coordinate space that also opens up new possibilities: if you know the rotation of a mouse wheel in degrees, you know things like "has the wheel been turned a full rotation". I don't quite know how, but I'm sure there are interesting interfaces you can make from that :)
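
For example, a caller could accumulate the per-event degrees to detect a full wheel rotation. A minimal sketch, written against the later axis-parameter API (the 0.8 signatures differed slightly), with the accumulator kept by the caller:

static double total_degrees; /* accumulated wheel travel */

static void
handle_wheel(struct libinput_event_pointer *pev)
{
        enum libinput_pointer_axis axis =
                LIBINPUT_POINTER_AXIS_SCROLL_VERTICAL;

        if (!libinput_event_pointer_has_axis(pev, axis) ||
            libinput_event_pointer_get_axis_source(pev) !=
            LIBINPUT_POINTER_AXIS_SOURCE_WHEEL)
                return; /* degrees only make sense for wheel events */

        total_degrees += libinput_event_pointer_get_axis_value(pev, axis);
        if (total_degrees >= 360.0) {
                printf("full rotation\n");
                total_degrees -= 360.0;
        }
}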

Of course, the physical properties don't change, the degrees are always a multiple of the click count, and on most mice one click count is a 15 degree movement. The Logitech M325 is a notable exception here with a 20 degree angle. This isn't advertised by the hardware of course so we rely on the udev hwdb to set it where it differs from the default. The patch for this has been pushed to systemd and will soon arrive at a distribution near you.

And to answer a question I'm sure will come up: those mice that support a free-spinning scrollwheel don't change the reporting mode. The G500s for example merely moves a physical bit that provides friction and the click feel/noise. The device still reports in 15 degree angle counts. I strongly suspect that other mice with this feature are the same (if not, we can now work this continuous motion into libinput and handle it properly).

January 14, 2015

After nearly 12 years working on Gentoo and hearing blathering about how “Gentoo is about choice” and “Gentoo is a metadistribution,” I’ve come to a conclusion about where we need to go if we want to remain viable as a Linux distribution.

If we want to have any relevance, we need to have focus. Everything for everybody is a guarantee that you’ll be nothing for anybody. So I’ve come up with three specific use cases for Gentoo that I’d like to see us focus on:

People developing software

As Gentoo comes, by default, with a guaranteed-working toolchain, it’s a natural fit for software developers. A few years back, I tried to set up a development environment on Ubuntu. It was unbelievably painful. More recently, I attempted the same on a Mac. Same result — a total nightmare if you aren’t building for Mac or iOS.

Gentoo, on the other hand, provides a proven-working development environment because you build everything from scratch as you install the OS. If you need headers or some library, it’s already there. No problem. Whereas I’ve attempted to get all of the barebones dev packages installed on many other systems and it’s been hugely painful.

Frankly, I’ve never come across as easy of a dev environment as Gentoo, if you’ve managed to set it up as a user in the first place. And that’s the real problem.

People who need extreme flexibility (embedded, etc.)

Nearly 10 years ago, I founded the high-performance clustering project in Gentoo, because it was a fantastic fit for my needs as an end user in a higher-ed setting. As it turns out, it was also a good fit for a number of other folks, primarily in academia but also including the Adelie Linux team.

What we found was that you could get an extra 5% or so of performance out of building everything from scratch. At small scale that sounds absurd, but when that translates into 5-6 digits or more of infrastructure purchases, suddenly it makes a lot more sense.

In related environments, I worked on porting v5 of the Linux Terminal Server Project (LTSP) to Gentoo. This was the first version that was distro-native vs pretending to be a custom distro in its own right, and the lightweight footprint of a diskless terminal was a perfect fit for Gentoo.

In fact, around the same time I fit Gentoo onto a 1.8MB floppy-disk image, including either the dropbear SSH client or the kdrive X server for a graphical environment. This was only possible through the magic of the ROOT and PORTAGE_CONFIGROOT variables, which you couldn’t find in any other distro.

Other distros such as ChromeOS and CoreOS have taken similar advantage of Gentoo’s metadistribution nature to build heavily customized Linux distros.

People who want to learn how Linux works

Finally, another key use case for Gentoo is for people who really want to understand how Linux works. Because the installation handbook actually walks you through the entire process of installing a Linux distro by hand, you acquire a unique viewpoint and skillset regarding what it takes to run Linux, well beyond what other distros require. In fact I’d argue that it’s a uniquely portable and low-level skillset that you can apply much more broadly than those you could acquire elsewhere.

In conclusion

I’ve suggested three core use cases that I think Gentoo should focus on. If it doesn’t fit those use cases, I would suggest that we allow but not specifically dedicate effort to enabling those particulars.

We’ve gotten overly deadened to how people want to use Linux, and this is my proposal as to how we could regain that focus.


January 05, 2015

If you read this blog entry, it is very likely that you are a direct beneficiary of open source and free software. Like myself you probably have been able to get hold of, use and tinker with software that in the old world of closed source dominance would altogether have cost you maybe ten thousand dollars or more. So with the spirit of the Yuletide season fresh in mind, it is time to open your wallet and support some important open source fundraising campaigns.

The first one is Builder, an IDE of our GNOME, which is an effort by the unstoppable Christian Hergert to create a truly powerful and modern IDE for GNOME. Christian has already made huge strides forward with his project since quitting his day job to start it, and helping fund him to cross the finish line would be greatly beneficial to us all. And I think it would make a wonderful addition to the Fedora Workstation effort, so this is an easy way for you to help us move that effort forward too. So head over to the fundraiser webpage or start by viewing the great fundraiser video below:

The second effort I want to highlight is the still ongoing fundraiser for the PiTiVi video editor. Since they started that effort they have raised 22190 USD of the 35 000 USD they need to get PiTiVi to a level where they are confident to make a 1.0 release. And I think we all agree that having a top notch video editor available, especially one that uses GStreamer and thus helps improve our general multimedia story, is very important. This effort also has a nice introduction video if you want to know more:

I have personally contributed money to both these efforts and I hope you will too! Both projects are crucial for the long term health of the ecosystem and both are done by credible teams with the right skills to succeed. So for those of us out of school and in paying jobs, setting aside for example 100 USD to help these two efforts should be an easy choice to make; the value we will get back easily dwarfs that amount.

January 01, 2015

I just pushed up a new branch to my LLVM repo that enables two important LLVM codegen features (machine scheduling and subreg liveness) for SI+ targets, which should improve performance of the radeonsi driver.

The biggest improvement that I’m seeing with this branch is in the luxmark luxball OpenCL demo, which is about 60% faster on my Bonaire. Other tests I’ve done show 10% – 25% improvements in performance. I haven’t done much OpenGL benchmarking, but I expect these changes will have a much bigger impact on the OpenCL benchmarks, so OpenGL improvements may be in the lower end of that range. I still need more benchmark results to know for sure.

December 20, 2014
Just in time for the upcoming break, we have figured out how to do alpha-test, and now supertuxkart is rendering properly:



If you are wondering about the new stk beta, I have a build from a few weeks back which seems to render properly as well. A few rough edges, but I think that is just from using a random git commit-id for stk. But we don't have enough gl3 features yet (on a3xx or a4xx) to be using the new rendering paths.

And gnome-shell works nicely too.  Still some rendering issues with xonotic.  And a little ways behind a3xx in piglit results, but not quite as much as I would have expected at this early stage.

Still missing are some optimizations that are important for certain use-cases (hw-binning support for games, GMEM bypass for UI/mipmap-generation/etc).  But the a420 in apq8084 (ifc6540 board) is surprisingly fast all the same.
December 17, 2014

Multi-Stream Transport 4k Monitors and X

I'm sure you've seen a 4k monitor on a friends desk running Mac OS X or Windows and are all ready to go get one so that you can use it under Linux.

Once you've managed to acquire one, I'm afraid you'll discover that when you plug it in, you're limited to 30Hz refresh rates at the full size, unless you're running a kernel that is version 3.17 or later. And then...

Good Grief! What Is My Computer Doing!

Ok, so now you're running version 3.17 and when X starts up, it's like you're using a gigantic version of Google Cardboard. Two copies of a very tall, but very narrow screen greet you.

Welcome to MST island.

In order to drive these giant new panels at full speed, there isn't enough bandwidth in the display hardware to individually paint each pixel once during each frame. So, like all good hardware engineers, they invented a clever hack.

This clever hack paints the screen in parallel. I'm assuming that they've got two bits of display hardware, each one hooked up to half of the monitor. Now, each paints only half of the pixels, avoiding costly redesign of expensive silicon, at least that's my surmise.

In the olden days, if you did this, you'd end up running two monitor cables to your computer, and potentially even having two video cards. Today, thanks to the magic of Display Port Multi-Stream Transport, we don't need all of that; instead, MST allows us to pack multiple cables-worth of data into a single cable.

I doubt the inventors of MST intended it to be used to split a single LCD panel into multiple "monitors", but hardware engineers are clever folk and are more than capable of abusing standards like this when it serves to save a buck.

Turning Two Back Into One

We've got lots of APIs that expose monitor information in the system, and across which we might be able to wave our magic abstraction wand to fix this:

  1. The KMS API. This is the kernel interface which is used by all graphics stuff, including user-space applications and the frame buffer console. Solve the problem here and it works everywhere automatically.

  2. The libdrm API. This is just the KMS ioctls wrapped in a simple C library. Fixing things here wouldn't make fbcons work, but would at least get all of the window systems working.

  3. Every 2D X driver. (Yeah, we're trying to replace all of these with the one true X driver). Fixing the problem here would mean that all X desktops would work. However, that's a lot of code to hack, so we'll skip this.

  4. The X server RandR code. More plausible than fixing every driver, this also makes X desktops work.

  5. The RandR library. If not in the X server itself, how about over in user space in the RandR protocol library? Well, the problem here is that we've now got two of them (Xlib and xcb), and the xcb one is auto-generated from the protocol descriptions. Not plausible.

  6. The Xinerama code in the X server. Xinerama is how we did multi-monitor stuff before RandR existed. These days, RandR provides Xinerama emulation, but we've been telling people to switch to RandR directly.

  7. Some new API. Awesome. Ok, so if we haven't fixed this in any existing API we control (kernel/libdrm/X.org), then we effectively dump the problem into the laps of the desktop and application developers. Given how long it's taken them to adopt current RandR stuff, providing yet another complication in their lives won't make them very happy.

All Our APIs Suck

Dave Airlie merged MST support into the kernel for version 3.17 in the simplest possible fashion -- pushing the problem out to user space. I was initially vaguely tempted to go poke at it and try to fix things there, but he eventually convinced me that it just wasn't feasible.

It turns out that all of our fancy new modesetting APIs describe the hardware in more detail than any application actually cares about. In particular, we expose a huge array of hardware objects:

  • Subconnectors
  • Connectors
  • Outputs
  • Video modes
  • Crtcs
  • Encoders

Each of these objects exposes intimate details about the underlying hardware -- which of them can work together, and which cannot; what kinds of limits are there on data rates and formats; and pixel-level timing details about blanking periods and refresh rates.

To make things work, some piece of code needs to actually hook things up, and explain to the user why the configuration they want just isn't possible.

The sticking point we reached was that when an MST monitor gets plugged in, it needs two CRTCs to drive it. If one of those is already in use by some other output, there's just no way you can steal it for MST mode.

Another problem -- we expose EDID data and actual video mode timings. Our MST monitor has two EDID blocks, one for each half. They happen to describe how they're related, and how you should configure them, but if we want to hide that from the application, we'll have to pull those EDID blocks apart and construct a new one. The same goes for video modes; we'll have to construct ones for MST mode.

Every single one of our APIs exposes enough of this information to be dangerous.

Every one, except Xinerama. All it talks about is a list of rectangles, each of which represents a logical view into the desktop. Did I mention we've been encouraging people to stop using this? And that some of them listened to us? Foolishly?

Dave's Tiling Property

Dave hacked up the X server to parse the EDID strings and communicate the layout information to clients through an output property. Then he hacked up the gnome code to parse that property and build a RandR configuration that would work.

Then, he changed the RandR Xinerama code to also parse the TILE properties and to fix up the data seen by applications from that.

This works well enough to get a desktop running correctly, assuming that desktop uses Xinerama to fetch this data. Alas, gtk has been "fixed" to use RandR if you have RandR version 1.3 or later. No biscuit for us today.

Adding RandR Monitors

RandR doesn't have enough data types yet, so I decided that what we wanted to do was create another one; maybe that would solve this problem.

Ok, so what clients mostly want to know is which bits of the screen are going to be stuck together and should be treated as a single unit. With current RandR, that's some of the information included in a CRTC. You pull the pixel size out of the associated mode, physical size out of the associated outputs and the position from the CRTC itself.
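To make that concrete, here's roughly the bookkeeping a client has to do today to reconstruct one "monitor" per active CRTC. This is a minimal sketch using the stock libXrandr API, with error handling omitted:

/* Sketch: assembling per-monitor data from RandR 1.2 objects.
 * Build with: cc monitors.c -o monitors -lX11 -lXrandr */
#include <stdio.h>
#include <X11/Xlib.h>
#include <X11/extensions/Xrandr.h>

int main(void)
{
    Display *dpy = XOpenDisplay(NULL);
    XRRScreenResources *res = XRRGetScreenResources(dpy, DefaultRootWindow(dpy));

    for (int c = 0; c < res->ncrtc; c++) {
        XRRCrtcInfo *crtc = XRRGetCrtcInfo(dpy, res, res->crtcs[c]);

        if (crtc->mode != None) {
            /* Position and pixel size come from the CRTC (and its mode) */
            printf("crtc %#lx: %ux%u+%d+%d\n", res->crtcs[c],
                   crtc->width, crtc->height, crtc->x, crtc->y);
            /* Physical size has to be fished out of the associated outputs */
            for (int o = 0; o < crtc->noutput; o++) {
                XRROutputInfo *out =
                    XRRGetOutputInfo(dpy, res, crtc->outputs[o]);
                printf("  output %s: %lumm x %lumm\n",
                       out->name, out->mm_width, out->mm_height);
                XRRFreeOutputInfo(out);
            }
        }
        XRRFreeCrtcInfo(crtc);
    }
    XRRFreeScreenResources(res);
    XCloseDisplay(dpy);
    return 0;
}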

Most of that information is available through Xinerama too; it's just missing physical sizes and any kind of labeling to help the user understand which monitor you're talking about.

The other problem with Xinerama is that it cannot be configured by clients; the existing RandR implementation constructs the Xinerama data directly from the RandR CRTC settings. Dave's Tiling property changes edit that data to reflect the union of associated monitors as a single Xinerama rectangle.

Allowing the Xinerama data to be configured by clients would fix our 4k MST monitor problem as well as solving the longstanding video wall, WiDi and VNC troubles. All of those want to create logical monitor areas within the screen under client control.

What I've done is create a new RandR datatype, the "Monitor", which defines a rectangular region of the screen. Each monitor has the following data:

  • Name. This provides some way to identify the Monitor to the user. I'm using X atoms for this as it made a bunch of things easier.

  • Primary boolean. This indicates whether the monitor is to be considered the "primary" monitor, suitable for placing toolbars and menus.

  • Pixel geometry (x, y, width, height). These locate the region within the screen and define the pixel size.

  • Physical geometry (width-in-millimeters, height-in-millimeters). These let the user know how big the pixels will appear in this region.

  • List of outputs. (I think this is the clever bit)

There are three requests to define, delete and list monitors. And that's it.
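From the client side, consuming Monitors then collapses to a single call. Here's a sketch using the XRRGetMonitors entry point from the libXrandr repository listed under "Where's the code" below; this is all work in progress, so treat the exact names and signature as provisional:

/* Sketch: listing RandR 1.5 Monitors; names may still change
 * before release. Build with: cc list.c -lX11 -lXrandr */
#include <stdio.h>
#include <X11/Xlib.h>
#include <X11/extensions/Xrandr.h>

int main(void)
{
    Display *dpy = XOpenDisplay(NULL);
    int n;
    /* get_active = True: only monitors that are currently displaying */
    XRRMonitorInfo *m = XRRGetMonitors(dpy, DefaultRootWindow(dpy), True, &n);

    for (int i = 0; i < n; i++) {
        char *name = XGetAtomName(dpy, m[i].name);
        printf("%s%s: %dx%d+%d+%d (%dmm x %dmm), %d output(s)\n",
               name, m[i].primary ? " [primary]" : "",
               m[i].width, m[i].height, m[i].x, m[i].y,
               m[i].mwidth, m[i].mheight, m[i].noutput);
        XFree(name);
    }
    XRRFreeMonitors(m);
    XCloseDisplay(dpy);
    return 0;
}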

Now, we want the list of monitors to completely describe the environment, and yet we don't want existing tools to break completely. So, we need some way to automatically construct monitors from the existing RandR state while still letting the user override portions of it as needed to explain virtual or tiled outputs.

So, what I did was to let the client specify a list of outputs for each monitor. All of the CRTCs which aren't associated with an output in any client-defined monitor are then added to the list of monitors reported back to clients. That means that clients need only define monitors for things they understand, and they can leave the other bits alone and the server will do something sensible.

The second tricky bit is that if you specify an empty rectangle at 0,0 for the pixel geometry, then the server will automatically compute the geometry using the list of outputs provided. That means that if any of those outputs get disabled or reconfigured, the Monitor associated with them will appear to change as well.

Current Status

Gtk+ has been switched to use RandR for RandR versions 1.3 or later. Locally, I hacked libXrandr to override the RandR version through an environment variable, set that to 1.2 and Gtk+ happily reverts back to Xinerama and things work fine. I suspect the plan here will be to have it use the new Monitors when present as those provide the same info that it was pulling out of RandR's CRTCs.

KDE appears to still use Xinerama data for this, so it "just works".

Where's the code

As usual, all of the code for this is in a collection of git repositories in my home directory on fd.o:

git://people.freedesktop.org/~keithp/randrproto master
git://people.freedesktop.org/~keithp/libXrandr master
git://people.freedesktop.org/~keithp/xrandr master
git://people.freedesktop.org/~keithp/xserver randr-monitors

RandR protocol changes

Here are the new sections added to randrproto.txt:

                  ❧❧❧❧❧❧❧❧❧❧❧

1.5. Introduction to version 1.5 of the extension

Version 1.5 adds monitors

 • A 'Monitor' is a rectangular subset of the screen which represents
   a coherent collection of pixels presented to the user.

 • Each Monitor is associated with a list of outputs (which may be
   empty).

 • When clients define monitors, the associated outputs are removed from
   existing Monitors. If removing the output causes the list for that
   monitor to become empty, that monitor will be deleted.

 • For each active CRTC that has no output associated with any
   client-defined Monitor, one server-defined monitor will
   automatically be defined for the first Output associated with it.

 • When defining a monitor, setting the geometry to all zeros will
   cause that monitor to dynamically track the bounding box of the
   active outputs associated with it.

This new object separates the physical configuration of the hardware
from the logical subsets of the screen that applications should
consider as single viewable areas.

1.5.1. Relationship between Monitors and Xinerama

Xinerama's information now comes from the Monitors instead of directly
from the CRTCs. The Monitor marked as Primary will be listed first.

                  ❧❧❧❧❧❧❧❧❧❧❧

5.6. Protocol Types added in version 1.5 of the extension

MONITORINFO { name: ATOM
          primary: BOOL
          automatic: BOOL
          x: INT16
          y: INT16
          width: CARD16
          height: CARD16
          width-in-millimeters: CARD32
          height-in-millimeters: CARD32
          outputs: LISTofOUTPUT }

                  ❧❧❧❧❧❧❧❧❧❧❧

7.5. Extension Requests added in version 1.5 of the extension.

┌───
    RRGetMonitors
    window : WINDOW
     ▶
    timestamp: TIMESTAMP
    monitors: LISTofMONITORINFO
└───
    Errors: Window

    Returns the list of Monitors for the screen containing
    'window'.

    'timestamp' indicates the server time when the list of
    monitors last changed.

┌───
    RRSetMonitor
    window : WINDOW
    info: MONITORINFO
└───
    Errors: Window, Output, Atom, Value

    Create a new monitor. Any existing Monitor of the same name is deleted.

    'name' must be a valid atom or an Atom error results.

    'name' must not match the name of any Output on the screen, or
    a Value error results.

    If 'info.outputs' is non-empty, and if x, y, width, height are all
    zero, then the Monitor geometry will be dynamically defined to
    be the bounding box of the geometry of the active CRTCs
    associated with them.

    If 'name' matches an existing Monitor on the screen, the
    existing one will be deleted as if RRDeleteMonitor were called.

    Each output in 'info.outputs' is removed from all pre-existing
    Monitors. If removing the output causes the list of outputs for
    that Monitor to become empty, then that Monitor will be deleted
    as if RRDeleteMonitor were called.

    Only one monitor per screen may be primary. If 'info.primary'
    is true, then the primary value will be set to false on all
    other monitors on the screen.

    RRSetMonitor generates a ConfigureNotify event on the root
    window of the screen.

┌───
    RRDeleteMonitor
    window : WINDOW
    name: ATOM
└───
    Errors: Window, Atom, Value

    Deletes the named Monitor.

    'name' must be a valid atom or an Atom error results.

    'name' must match the name of a Monitor on the screen, or a
    Value error results.

    RRDeleteMonitor generates a ConfigureNotify event on the root
    window of the screen.

                  ❧❧❧❧❧❧❧❧❧❧❧
December 16, 2014
So kernel version 3.18 is out the door and it's time for our regular look at what's in the next merge window.
First, looking at new hardware, the big item is basic Skylake support. There are still a few small things missing, but mostly it's there now. This has been contributed by Damien, Satheeshakrishna and a lot of other folks. Looking at other platforms, there have also been a lot of changes for vlv/chv: improved backlight code, completely refactored interrupt handling to bring it in line with other platforms, and rewritten panel power sequencing code, all from Ville. Rodrigo contributed PSR support for vlv/chv together with a lot of other fixes for PSR. Unfortunately it's not yet re-enabled by default.

Moving on to Broadwell and the render side of things, Mika and Arun provided patches to improve the render workaround code and bring the set of workarounds up to date. execlist (the new command submission support on Gen8+) is also being polished with the addition of on-demand pinning of context objects with patches from Thomas Daniel and Oscar Mateo. Finally the RPS/render-turbo code has seen a lot of polish from Imre with a few fixes from Tom O'Rourke.

Otherwise not a lot of really big things happened on the GEM side: Just a few patches to fix issues in ppgtt (unfortunately still not enabled by default anywhere due to fun with context switches). And there's a bit of prep work and reorg all over for new stuff landing hopefully soon.

Looking at overall infrastructure changes, the big thing certainly is the preparation for atomic display updates. The drm core/driver interface for atomic and all the helper library code to convert drivers has landed in 3.19, along with some driver conversions already. On the Intel side it's been just prep work under the hood thus far, with patches from Ander to precompute display PLL state. The new code to use vblank evasion for pageflips has also landed, which is needed for atomic plane updates. And prep patches from Gustavo Padovan started to split the low-level plane update functions into check and commit steps. Lots more patches from different people are in flight and some have been merged for 3.20 already.

Besides these driver internal changes for atomic there has been other work to improve the codebase: Imre reorganized our handlers for suspend, resume and thawing and freezing. Jani reworked the audio and eld code which is the gfx side of the puzzle needed to make audio over HDMI or DP work. Jesse provided patches to track infoframes more accurately, which is needed to correctly fastboot (i.e. without modesets if possible) on external screens.

For older machines Ville has spent a few spare cycles to make them more useful: GPU reset support for gen3/4 should mitigate some of the recent chromium crashes on mesa, and the modeset code on i830M might work correctly for the first time, ever.


And of course the usual pile of smaller fixes and improvements all over.

Not directly related to code or features is the start of documenting i915 driver internals: with this release we now have some of the interrupt handling, fifo underrun reporting, frontbuffer tracking and runtime pm support newly documented. And there's lots more in flight, so hopefully soonish this will be fairly useful.

Those running Fedora Rawhide or GNOME 3.12 may have noticed that there is no Xorg.log file anymore. This is intentional: gdm now starts the X server so that it writes the log to the systemd journal. Update 29 Mar 2014: The X server itself has no capability for logging to the journal yet, but no changes to the X server were needed anyway. gdm merely starts the server with a /dev/null logfile and redirects stdout/stderr to the journal.

Thus, to get the log file use journalctl, not vim, cat, less, notepad or whatever your $PAGER was before.

This leaves us with the following commands.


journalctl -e _COMM=Xorg
Which would conveniently show something like this:

Mar 25 10:48:41 yabbi Xorg[5438]: (II) UnloadModule: "wacom"
Mar 25 10:48:41 yabbi Xorg[5438]: (II) evdev: Lenovo Optical USB Mouse: Close
Mar 25 10:48:41 yabbi Xorg[5438]: (II) UnloadModule: "evdev"
Mar 25 10:48:41 yabbi Xorg[5438]: (II) evdev: Integrated Camera: Close
Mar 25 10:48:41 yabbi Xorg[5438]: (II) UnloadModule: "evdev"
Mar 25 10:48:41 yabbi Xorg[5438]: (II) evdev: Sleep Button: Close
Mar 25 10:48:41 yabbi Xorg[5438]: (II) UnloadModule: "evdev"
Mar 25 10:48:41 yabbi Xorg[5438]: (II) evdev: Video Bus: Close
Mar 25 10:48:41 yabbi Xorg[5438]: (II) UnloadModule: "evdev"
Mar 25 10:48:41 yabbi Xorg[5438]: (II) evdev: Power Button: Close
Mar 25 10:48:41 yabbi Xorg[5438]: (II) UnloadModule: "evdev"
Mar 25 10:48:41 yabbi Xorg[5438]: (EE) Server terminated successfully (0). Closing log file.
The -e toggle jumps to the end and only shows 1000 lines, but that's usually enough. journalctl has a bunch more options described in the journalctl man page. Note the PID in square brackets though. You can easily limit the output to just that PID, which makes it ideal for attaching the log to a bug report.

journalctl _COMM=Xorg _PID=5438
Previously the server kept only a single backup log file around, so if you restarted twice after a crash, the log was gone. With the journal it's now easy to extract the log file from that crash five restarts ago. It's almost like the future is already here.

Update 16/12/2014: This post initially suggested using journalctl /usr/bin/Xorg. Using _COMM is path-independent.

Fedora 21

Added 16/12/2014: If you recently updated to or installed Fedora 21 you'll notice that the above command won't show anything. As part of the Xorg without root rights feature, Fedora ships a wrapper script as /usr/bin/Xorg. This script eventually executes /usr/libexec/Xorg.bin, which is the actual X server binary. Thus, on Fedora 21 replace Xorg with Xorg.bin:


journalctl -e _COMM=Xorg.bin
journalctl _COMM=Xorg.bin _PID=5438
Note that we're looking into this, so that in a few updates' time a special command won't be needed here.

December 13, 2014

Present and Compositors

The current Present extension is pretty unfriendly to compositing managers, causing an extra frame of latency between the application's operation and the scanout buffer. Here's how I'm fixing that.

An extra frame of lag

When an application uses PresentPixmap, that operation is generally delayed until the next vblank interval. When using X without compositing, this ensures that the operation will get started in the vblank interval, and, if the rendering operation is quick enough, you'll get the frame presented without any tearing.

When using a compositing manager, the operation is still delayed until the vblank interval. That means that the CopyArea and subsequent Damage event generation don't occur until the display has already started the next frame. The compositing manager receives the damage event and constructs a new frame, but it also wants to avoid tearing, so that frame won't get displayed immediately; instead, it'll get delayed until the next frame, introducing the lag.

Copy now, complete later

While away from the keyboard this morning, I had a sudden idea -- what if we performed the CopyArea and generated Damage right when the PresentPixmap request was executed, but delayed the PresentComplete event until vblank happened?

With the contents updated and damage delivered, the compositing manager can immediately start constructing a new scene for the upcoming frame. When that is complete, it can also use PresentPixmap (either directly or through OpenGL) to queue the screen update.

If it's fast enough, that will all happen before vblank and the application contents will actually appear at the desired time.

Now, at the appointed vblank time, the PresentComplete event will get delivered to the client, telling it that the operation has finished and that its contents are now on the screen. If the compositing manager was quick, this event won't even be a lie.

We'll be lying less often

Right now, the CopyArea, Damage and PresentComplete operations all happen after the vblank has passed. Because the compositing manager delays the screen update until the next vblank, every single PresentComplete event will have the wrong UST/MSC values in it.

With the CopyArea happening immediately, we've a pretty good chance that the compositing manager will get the application contents up on the screen at the target time. When this happens, the PresentComplete event will have the correct values in it.
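For reference, here's roughly what the receiving end of those UST/MSC values looks like in a client. This is a minimal xcb sketch, assuming an existing connection `c` and mapped window `win`, and that the Present extension is available:

/* Sketch: watching PresentCompleteNotify and its UST/MSC values.
 * Link with -lxcb -lxcb-present. */
#include <stdio.h>
#include <stdlib.h>
#include <xcb/xcb.h>
#include <xcb/present.h>

static void watch_complete(xcb_connection_t *c, xcb_window_t win)
{
    const xcb_query_extension_reply_t *ext =
        xcb_get_extension_data(c, &xcb_present_id);
    if (!ext || !ext->present)
        return;

    /* Ask for CompleteNotify events on this window */
    xcb_present_event_t eid = xcb_generate_id(c);
    xcb_present_select_input(c, eid, win,
                             XCB_PRESENT_EVENT_MASK_COMPLETE_NOTIFY);
    xcb_flush(c);

    xcb_generic_event_t *ev;
    while ((ev = xcb_wait_for_event(c))) {
        xcb_ge_generic_event_t *ge = (xcb_ge_generic_event_t *) ev;

        if ((ev->response_type & 0x7f) == XCB_GE_GENERIC &&
            ge->extension == ext->major_opcode &&
            ge->event_type == XCB_PRESENT_COMPLETE_NOTIFY) {
            xcb_present_complete_notify_event_t *cn = (void *) ev;
            /* These are the values that may currently "lie" when a
             * compositing manager delays the actual screen update. */
            printf("complete: ust=%llu msc=%llu\n",
                   (unsigned long long) cn->ust,
                   (unsigned long long) cn->msc);
        }
        free(ev);
    }
}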

How can we do better?

The only way to do better is to have the PresentComplete event generated when the compositing manager displays the frame. I've talked about how that should work, but it's a bit twisty, and will require changes in the compositing manager to report the association between its own PresentPixmap requests and the applications' PresentPixmap requests.

Where's the code

I've got a set of three patches, two of which restructure the existing code without changing any behavior and a final patch which adds this improvement. Comments and review are encouraged, as always!

git://people.freedesktop.org/~keithp/xserver.git present-compositor
December 08, 2014
As the development window for GNOME 3.16 advances, I've been adding a few new developer features, selfishly, so I could use them in my own programs.

Connectivity support for applications

Picking up from where Dan Winship left off, we've merged support for applications to detect network availability, especially the "connected to a network but not to the Internet" case.

In glib/gio now, watch the value of the "connectivity" property in GNetworkMonitor.
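If you want to try it, something like this should work (a minimal sketch; the "connectivity" property and its getter are only available with a GLib new enough to carry this merge):

/* Sketch: watching GNetworkMonitor's new "connectivity" property.
 * Build with: cc demo.c $(pkg-config --cflags --libs gio-2.0) */
#include <gio/gio.h>

static void
connectivity_changed (GObject *object, GParamSpec *pspec, gpointer user_data)
{
  GNetworkMonitor *monitor = G_NETWORK_MONITOR (object);

  switch (g_network_monitor_get_connectivity (monitor))
    {
    case G_NETWORK_CONNECTIVITY_LOCAL:
      g_print ("no network\n");
      break;
    case G_NETWORK_CONNECTIVITY_LIMITED:
      /* The interesting new case: on a network, but not on the Internet */
      g_print ("connected to a network, but not to the Internet\n");
      break;
    case G_NETWORK_CONNECTIVITY_PORTAL:
      g_print ("behind a captive portal\n");
      break;
    case G_NETWORK_CONNECTIVITY_FULL:
      g_print ("full Internet access\n");
      break;
    }
}

int
main (void)
{
  GNetworkMonitor *monitor = g_network_monitor_get_default ();

  g_signal_connect (monitor, "notify::connectivity",
                    G_CALLBACK (connectivity_changed), NULL);
  g_main_loop_run (g_main_loop_new (NULL, FALSE));
  return 0;
}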

Grilo automatic network awareness

This glib/gio feature allows us to show/hide Grilo sources from applications' view if they require Internet and LAN access to work. This should be landing very soon, once we've made the new feature optional based on the presence of the new GLib.

Totem

And finally, this means we'll soon be able to show a nice placeholder when no network connection is available, and there are no channels left.

Grilo Lua resources support

A long-standing request, GResources support has landed for Grilo Lua plugins. When a script is loaded, we'll look for a separate GResource file with ".gresource" as the suffix, and automatically load it. This means you can use a local icon for sources with the URL "resource:///org/gnome/grilo/foo.png". Your favourite Lua sources will soon have icons!

Grilo Opensubtitles plugin

The developers affected by this new feature may be a group of one, but if the group is ever to expand, it's the right place to do it. This new Grilo plugin will fetch the list of available text subtitles for specific videos, given their "hashes", which are now exported by Tracker.

GDK-Pixbuf enhancements

I can point you to the NEWS file for the latest version, but the main gains are that GIF animations won't eat all your memory anymore, DPI metadata support in the JPEG, PNG and TIFF formats, and, for image viewers, the ability to tell whether a TIFF file is multi-page so it can be opened in a more capable viewer.

Batched inserts, and better filters in GOM

Does what it says on the tin. This is useful for populating the database quicker than through piecemeal inserts; it also means you don't need to chain inserts when inserting multiple items.

Mathieu also worked on fixing the priority of filters when building complex queries, as well as supporting more than 2 items in a filter ("foo OR bar OR baz" for example).
December 06, 2014

click here to jump to the instructions

Mice have an optical sensor that tells them how far they moved, in "mickeys". Depending on the sensor, a mickey is anywhere from 1/100 to 1/8200 of an inch, or even less. The current "standard" resolution is 1000 DPI, but older mice will have 800 DPI, 400 DPI, etc. Resolutions above 1200 DPI are generally reserved for gaming mice with (usually) switchable resolution, and it's an arms race between manufacturers over who can advertise higher numbers.

HW manufacturers are cheap bastards, so of course the mice don't advertise the sensor resolution. Which means that for the purpose of pointer acceleration there is no physical reference. That delta of 10 could be a millimeter of mouse movement or a nanometer; you just can't know. And if pointer acceleration works on input without a reference, it becomes useless and unpredictable. That is partially intended: HW manufacturers advertise that a lower resolution will provide more precision while sniping and a higher resolution means faster turns while running around doing rocket jumps. I personally don't think that there's much difference between 5000 and 8000 DPI anymore; the mouse is so sensitive that if you sneeze your pointer ends up next to Philae. But then again, who am I to argue with marketing types.

For us, useless and unpredictable is bad, especially in the use-case of everyday desktops. To work around that, libinput 0.7 now incorporates the physical resolution into pointer acceleration. And to do that we need a database, which will be provided by udev as of systemd 218 (unreleased at the time of writing). This database lists the various devices and their physical resolution, together with their sampling rate. udev sets the resolution as the MOUSE_DPI property that we can read in libinput and use as a reference point in the pointer accel code. In the simplest case, the entry lists a single resolution with a single frequency (e.g. "MOUSE_DPI=1000@125"); for switchable gaming mice it lists a list of resolutions with frequencies and marks the default with an asterisk ("MOUSE_DPI=400@50 800@50 *1000@125 1200@125"). And you can and should help us populate the database so it gets useful really quickly.
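To illustrate the format, here's a hedged sketch of how such a property value decomposes; this is not libinput's actual parser, just an illustration:

/* Sketch: decoding a MOUSE_DPI udev property such as
 * "400@50 800@50 *1000@125 1200@125". Illustrative only. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Returns the default resolution in DPI: the entry marked with '*',
 * or the first entry if nothing is marked. */
static int mouse_dpi_default(const char *prop)
{
    char buf[256];
    int dpi = 0, first = 0;

    snprintf(buf, sizeof(buf), "%s", prop);
    for (char *tok = strtok(buf, " "); tok; tok = strtok(NULL, " ")) {
        int starred = (*tok == '*');
        int value = atoi(tok + starred); /* atoi stops at the '@' */
        if (!first)
            first = value;
        if (starred)
            dpi = value;
    }
    return dpi ? dpi : first;
}

int main(void)
{
    printf("%d\n", mouse_dpi_default("1000@125"));                         /* 1000 */
    printf("%d\n", mouse_dpi_default("400@50 800@50 *1000@125 1200@125")); /* 1000 */
    return 0;
}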

How to add your device to the database

We use udev's hwdb for the database list. The upstream file is in /usr/lib/udev/hwdb.d/70-mouse.hwdb, the ruleset to trigger a match is in /usr/lib/udev/rules.d/70-mouse.rules. The easiest way to add a match is with the libevdev mouse-dpi-tool (version 1.3.2). Run it and follow the instructions. The output looks like this:


$ sudo ./tools/mouse-dpi-tool /dev/input/event8
Mouse Lenovo Optical USB Mouse on /dev/input/event8
Move the device along the x-axis.
Pause 3 seconds before movement to reset, Ctrl+C to exit.
Covered distance in device units: 264 at frequency 125.0Hz | |^C
Estimated sampling frequency: 125Hz
To calculate resolution, measure physical distance covered
and look up the matching resolution in the table below
16mm 0.66in 400dpi
11mm 0.44in 600dpi
8mm 0.33in 800dpi
6mm 0.26in 1000dpi
5mm 0.22in 1200dpi
4mm 0.19in 1400dpi
4mm 0.17in 1600dpi
3mm 0.15in 1800dpi
3mm 0.13in 2000dpi
3mm 0.12in 2200dpi
2mm 0.11in 2400dpi

Entry for hwdb match (replace XXX with the resolution in DPI):
mouse:usb:v17efp6019:name:Lenovo Optical USB Mouse:
MOUSE_DPI=XXX@125
Take those last two lines and add them to a new local file /etc/udev/hwdb.d/71-mouse.hwdb. Rebuild the hwdb, trigger it, and done:

$ sudo udevadm hwdb --update
$ sudo udevadm trigger /dev/input/event8
Leave out the device path if you're not on systemd 218 yet. Check if the property is set:

$ udevadm info /dev/input/event8 | grep MOUSE_DPI
E: MOUSE_DPI=1000@125
And that shows everything worked. Restart X/Wayland/whatever uses libinput and you're good to go. If it works, double-check the upstream instructions, then file a bug against systemd with those two lines and assign it to me.

Trackballs are a bit hard to measure like this, my suggestion is to check the manufacturer's website first for any resolution data.

Update 2014/12/06: trackball comment added, udevadm trigger comment for pre 218

December 02, 2014

Disclaimer: Limba is still in a very early stage of development. Bugs happen, and I give no guarantees on API stability yet.

Limba is a very simple cross-distro package installer, utilizing OverlayFS found in recent Linux kernels (>= 3.18).

As an example, I created a small Limba package for one of the Qt5 demo applications, and I would like to share the process of creating Limba packages – it’s quite simple, and I could use some feedback on how well the resulting packages work on multiple distributions.

I assume that you have compiled Limba and installed it – how that is done is described in its README file. So, let’s start.

1. Prepare your application

The cool thing about Limba is that you don’t really have to do many changes on your application. There are a few things to pay attention to, though:

  • Ensure the binaries and data are installed into the right places in the directory hierarchy. Binaries must go to $prefix/bin, for example.
  • Ensure that configuration can be found under /etc as well as under $prefix/etc

This needs to be done so your application will find its data at runtime. Additionally, you need to write an AppStream metadata file, and find out which stuff your application depends on.
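For reference, a minimal AppStream metainfo file could look roughly like this (the id, names and license values here are made up for illustration; see the AppStream documentation for the authoritative tag set):

<?xml version="1.0" encoding="UTF-8"?>
<component type="desktop">
  <!-- hypothetical component-id; this is also the name other packages
       will use in their "Requires" line -->
  <id>qt5demo.desktop</id>
  <name>Qt5 Demo</name>
  <summary>OpenGL demo from the Qt5 examples</summary>
  <metadata_license>CC0-1.0</metadata_license>
  <project_license>BSD-3-Clause</project_license>
</component>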

2. Create package metadata & install software

2.1 Basics

Now you can create the metadata necessary to build a Limba package. Just run

cd /path/to/my/project
lipkgen make-template

This will create a “lipkg” directory, containing a “control” file and a “metainfo.xml” file, which can be a symlink to the AppStream metadata, or be new metadata.

Now, configure your application with /opt/bundle as install prefix (-DCMAKE_INSTALL_PREFIX=/opt/bundle, –prefix=/opt/bundle, etc.) and install it to the lipkg/inst_target directory.

2.2 Handling dependencies

If your software has dependencies on other packages, just get the Limba packages for these dependencies, or build new ones. Then place the resulting IPK packages in the lipkg/repo directory. Ideally, you should be able to fetch Limba packages which contain the software components directly from their upstream developers.

Then, open the lipkg/control file and adjust the “Requires” line. The names of the components you depend on match their AppStream-IDs (<id/> tag in the AppStream XML document). Any version-relation (>=, >>, <<, <=, <>) is supported, and specified in brackets after the component-id.

The resulting control-file might look like this:

Format-Version: 1.0

Requires: Qt5Core (>= 5.3), Qt5DBus (>= 5.3), libpng12

If the specified dependencies are in the repo/ subdirectory, these packages will get installed automatically, if your application package is installed. Otherwise, Limba depends on the user to install these packages manually – there is no interaction with the distribution’s package-manager (yet?).

3. Building the package

In order to build your package, make sure the content in inst_target/ is up to date, then run

lipkgen build lipkg/

This will build your package and output it in the lipkg/ directory.

4. Testing the package

You can now test your package. Just run

sudo lipa install package.ipk

Your software should install successfully. If you provided a .desktop file in $prefix/share/applications, you should find your application in your desktop’s application menu. Otherwise, you can run a binary from the command line; just append the version of your package to the binary name (bash-completion helps). Alternatively, you can use the runapp command, which lets you run any binary in your bundle/package and is quite helpful for debugging (since the environment a Limba-installed application runs in differs from that of other applications).

Example:

runapp ${component_id}-${version}:/bin/binary-name

And that’s it! :-)

I used these steps to create a Limba package for the OpenGL Qt5 demo on Tanglu 2 (Bartholomea), and tested it on Kubuntu 15.04 (Vivid) with KDE, as well as on an up-to-date Fedora 21, with GNOME and without any Qt or KDE stuff installed:

[Screenshots: the Qt5 demo installed with Limba, running on Kubuntu and on Fedora]

I encountered a few obstacles when building the packages, e.g. Qt5 initially didn’t find the right QPA plugin – that has been fixed by adjusting a config file in the Qt5Gui package. Also, on Fedora, a matching libpng was missing, so I included that as well.

You can currently find the packages on GitHub (but I am planning to move them to a different place soon). The biggest issue with Limba at this time is that it needs Linux 3.18, or an older kernel with OverlayFS support compiled in. Apart from that and a few bugs, the experience is quite smooth. As soon as I am sure there are no hidden fundamental issues, I can think about implementing more features, like signing packages and automatically updating them.

Have fun playing around with Limba!

December 01, 2014

A long-standing and unfixable problem in X is that we cannot send a number of keys to clients because their keycode is too high. This doesn't affect any of the normal keys for typing, but a lot of multimedia keys, especially "newly" introduced ones.

X has a maximum keycode of 255, and "Keycodes lie in the inclusive range [8,255]". The reason for the offset 8 keeps escaping me but it doesn't matter anyway. Effectively, it means that we are limited to 247 keys per keyboard. Now, you may think that this would be enough and thus the limit shouldn't really affect us. And you're right. This post explains why it is a problem nonetheless.

Let's discard any ideas about actually increasing the limit from 8 bit to 32 bit. It's hardwired into too many open-coded structs that this is simply not an option. You'd be breaking every X client out there, so at this point you might as well rewrite the display server and aim for replacing X altogether. Oh wait...

So why aren't 247 keycodes enough? The reason is that large chunks of that range are unused and wasted.

In X, the keymap is an array in the form keysyms[keycode] = some keysym (that's a rather simplified view, look at the output from "xkbcomp -xkb $DISPLAY -" for details). The actual value of the keycode doesn't matter in theory, it's just an index. Of course, that theory only applies when you're looking at one keyboard at a time. We need to ship keymaps that are useful everywhere (see xkeyboard-config) and for that we need some sort of standard. In the olden days this meant every vendor had their own keycodes (see /usr/share/X11/xkb/keycodes) but these days Linux normalizes it to evdev keycodes. So we know that KEY_VOLUMEUP is always 115 and we can hook it up so it just works out of the box. That however leaves us with huge ranges of unused keycodes because every device is different. My keyboard does not have a CD eject key, but it has volume control keys. I have a key to start a web browser but I don't have a key to start a calculator. Others' keyboards do have those keys though, and they expect those keys to work. So the default keymap needs to map all the possible keycodes to the matching keysyms, and suddenly our 247 keycodes per keyboard become 247 keycodes for all keyboards ever made. And that is simply not enough.
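You can see that array, holes included, from any client; a quick sketch that walks the legal keycode range:

/* Sketch: dumping the base keysym for every keycode in the inclusive
 * range [8,255]. Build with: cc keymap.c -lX11 */
#include <stdio.h>
#include <X11/Xlib.h>
#include <X11/XKBlib.h>

int main(void)
{
    Display *dpy = XOpenDisplay(NULL);

    for (int keycode = 8; keycode <= 255; keycode++) {
        /* group 0, shift level 0: the "base" symbol for this keycode */
        KeySym sym = XkbKeycodeToKeysym(dpy, keycode, 0, 0);
        char *name = (sym == NoSymbol) ? NULL : XKeysymToString(sym);
        printf("%3d: %s\n", keycode, name ? name : "(unused)");
    }
    XCloseDisplay(dpy);
    return 0;
}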

To work around this, we'd need locally hardware-adjusted keymaps generated at runtime. After loading the driver we could look at the keys that exist, remap higher keycodes into an unused range, and then communicate that mapping so the keysyms end up on the newly mapped keycodes. This is... complicated. evdev doesn't know about keymaps. When gnome-settings-daemon applied your user-specific layout, evdev didn't get told about it. GNOME, on the other hand, has no idea that evdev previously re-mapped higher keycodes. So when g-s-d applies your setting, it may overwrite the remapped keysym with the one from the default keymaps (and evdev won't notice).

As usual, none of this is technically unfixable. You could figure out a protocol extension through which drivers talk to the server and the clients to notify them of remapped keycodes. This would of course need to be added to evdev, the server, libX11, probably xkbcomp and libxkbcommon, and of course to all desktop environments that set the layout. To write the patches you need a deep understanding of XKB, which would definitely make your skillset a rare one, probably make you quite employable, and possibly put you on the fast track for your nearest mental institution. XKB and happiness don't usually go together, but at least the jackets will keep you warm.

Because of the above, we go with the simple answer: "X can't handle keycodes over 255".