planet.freedesktop.org
February 16, 2018
When I designed virgl I added a capability system to pass some info about the host GL to the guest driver, along the lines of gallium caps. The design was that at the virtio-GPU level you have a number of capsets, each of which has a max version and max size.

The virgl capset is capset 1 with max version 1 and size 308 bytes.

Until now we've happily been using version 1 at 308 bytes. Recently we decided we wanted to have a v2 at 380 bytes, and the world fell apart.

It turned out there is a bug in the guest kernel driver: it asks the host for a list of capsets and allows guest userspace to retrieve info from it. The guest userspace has its own copy of the struct.

The flow is:
1) The guest Mesa driver gives the kernel a caps struct to fill out for capset 1.
2) The kernel driver asks the host over virtio for the latest capset 1 info: max size and version.
3) The host returns the max_size and version for capset 1.
4) The kernel driver asks the host to fill out malloced memory of max_size with the caps struct.
5) The kernel driver copies the returned caps struct to userspace, using the size of the returned host struct.

The bug is in the last step: it uses the size of the returned host struct, which ends up corrupting guest memory in the scenario where the host has a capset 1 v2 of size 380, but the guest is still running old userspace which only understands capset 1 v1, size 308.

The 380 bytes get memcpy'd over the 308-byte struct and boom.

Now we can fix the kernel not to do this, but we can't upgrade every kernel in an existing VM. So if we allow the virglrenderer process to expose a v2, all older guest software will explode unless it is also upgraded, which isn't really something you want in a VM world.

I came up with some virglrenderer workarounds, but due to another bug where qemu doesn't reset virglrenderer when it should, there was no way to make it reliable, and things like kexec'ing an old kernel from a new kernel would blow up.

I decided in the end to bite the bullet and just make capset 2 be a repaired one. Unfortunately this needs patches in all 4 components before it can be used.

1) virglrenderer needs to expose capset 2 with the new version/size to qemu.
2) qemu needs to allow the virtio-gpu device to expose capset 2 as a virgl capset.
3) The guest kernel needs fixing to make sure we copy only the minimum of the host caps size and the guest caps size into the guest userspace driver, and it then needs to provide a way for guest userspace to know the fixed version is in place.
4) The guest userspace needs to check if the guest kernel has the fix, then query capset 2 first and fall back to querying capset 1.

After I talked to a few other devs in virgl land, they pointed out that we could probably just never add a new version of capset 2, and instead grow the struct endlessly.

The guest driver would fill out the struct it wants to use with its copy of default minimum values.
It would then call the kernel ioctl to copy over the host caps. The kernel ioctl would copy only the minimum of the host caps size and the guest caps size.

In this case, if the host has a 400-byte capset 2 and the guest still only has a 380-byte capset 2, the new fields from the host won't get copied into the guest struct and it will be fine.

If the guest has the 400-byte capset 2, but the host only has the 380-byte capset 2, the guest would preinit the extra 20 bytes with its default values (0 or whatever), and the kernel would only copy 380 bytes into the start of the 400 bytes and leave the extra bytes alone.

Now I just have to go write the patches and confirm it all.

Thanks to Stephane at google for creating the patch that showed how broken it was, and to others in the virgl community who noticed how badly it broke old guests! Now to go write the patches...
February 14, 2018
Hi All,

First of all, thank you to everyone who has been sending me PSR test results; I've received well over 100 reports!

Quite a few testers have reported various issues when enabling PSR; the three most often reported issues are:

  • flickering

  • black screen

  • cursor / input lag

The Intel graphics team has been working on a number of fixes which make PSR work better in various cases. Note we don't expect this to fix it everywhere, but it should get better and work on more devices in the near future.

This is good news, but the bad news is that this means that all the tests people have so very kindly done for me will need to be redone once the new improved PSR code is ready for testing. I will do a new blog post (and email people who have sent me test reports) when the new PSR code is ready for people to (re-)test (sorry).

Regards,

Hans
February 13, 2018
zsh: corrupt history file /home/$USER/.zsh_history

Most zsh users will have seen the above line at one time or another.
And it means that re-using your shell history is no longer possible.

Maybe some of it can be recovered, but more than likely some has been lost. And even if nothing important has been lost, you probably don't want to spend any time dealing with this.

Make zsh maintain a backup

Run this snippet in the terminal of your choice.

cat <<'EOT' >> ~/.zshrc

# Backup and restore ZSH history
# Join multi-line history entries, then keep a de-duplicated backup copy
strings ~/.zsh_history | sed ':a;N;$!ba;s/\\\n//g' | sort | uniq -u > ~/.zsh_history.backup
# Merge history and backup; write via a temporary file so ~/.zsh_history
# is not truncated while it is still being read
cat ~/.zsh_history ~/.zsh_history.backup | sed ':a;N;$!ba;s/\\\n//g' | sort | uniq > ~/.zsh_history.merged
mv ~/.zsh_history.merged ~/.zsh_history

EOT

What does this actually do?

The snippet …

February 12, 2018
History

As a developer, there are always those projects where it is hard to find a way to go forward.  Drop the project for now and find another project, if only to rest your eyes and find yourself a new insight for the temporarily abandoned project.  This is how I embarked on posix_spawn() as an actual system call, which you will find in Oracle Solaris 11.4. The original library implementation of posix_spawn() uses vfork(), but why care about the old address space if you are not going to use it? Or, worse, stop all the other threads in the process and not start them until exec() succeeds or exit() is called?

As I had already written kernel modules for nefarious reasons to run executables directly from the kernel, I decided to benchmark the simple "make process, execute /bin/true" approach against posix_spawn() from the library. Even with two threads, posix_spawn() scaled poorly: additional threads did not allow many additional spawns per second.

Starting a new process

All ways to start a new process need to copy a number of process properties: file descriptors, credentials, priorities, resource controls, etc.

The original way to start a new process is fork(); you need to mark all the pages as copy-on-write (O(n) in the number of pages in the process), so this gets more and more expensive as the process gets larger and larger. In Solaris we also reserve all the needed swap; a large process calling fork() doubles its swap requirement.

In BSD, vfork() was introduced; it borrows the address space and was cheap when it was invented.  In much larger processes with hundreds of threads, it became more and more of a bottleneck.  Dynamic linking also throws a spanner in the works: what you can do between vfork() and the final exec() is extremely limited.

In the standard universe, posix_spawn() was invented; it was aimed mostly at small embedded systems, and only a small number of specific actions can be performed before the new executable is run.  As it was part of the standard, Solaris grew its own copy built on top of vfork(). It has, of course, the same problems as vfork(); but because it is implemented in the library we can be sure we steer clear of all the other vfork() pitfalls.

Native spawn(2) call

The native spawn(2) system call introduced in Oracle Solaris 11.4 shares a lot of code with forkx(2) and execve(2).  It mostly avoids doing those unneeded operations:

  • do not stop all threads
  • do not copy any data about the current executable
  • do not clear all watch points (vfork())
  • do not duplicate the address space (fork())
  • no need to handle shared memory segments
  • do not copy one or more of the threads (fork1/forkall), create a new one instead
  • do not copy all file pointers
  • no need to restart all threads held earlier

The exec() call copies arguments from its own address space, but when spawn(2) needs the arguments, it is already in a new process.  So early in the spawn(2) system call we copy the environment vector and the arguments and save them away.  The data blob is given to the child, and the parent waits until the child is about to return from the system call in the new process, or until it decides that it can't actually exec and calls exit instead.

A process can call spawn(2) from all of its threads, and the concurrency is only limited by locks that need to be held briefly when processes are created.

The performance win depends on the application; you won't win anything unless you use posix_spawn(). I was very happy to see that our standard shell uses posix_spawn() to start new processes, as do popen(3C) and system(3C), so the call is well tested.  The more threads you have, the bigger the win. Stopping a thread is expensive, especially if it is held up in a system call. The world used to stop; now it just continues.

Support in truss(1), mdb(1)

When developing a new system call, special attention needs to be given to proc(5) and truss(1) interaction.  The spawn(2) system call is no exception, only it is much harder to get right; support is also needed in debuggers or they won't see a new process starting. This includes mdb(1) but also truss(1).  They also need to learn that when spawn(2) succeeds, they are stopping in a completely different executable; we may also have crossed a privilege boundary, e.g., when spawning su(8) or ping(8).
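
If you want to watch the new system call in action, truss(1) can filter on it by name. A small sketch (assuming your shell and libc already use posix_spawn() as described above, and that truss accepts the new spawn syscall name; output will vary):

truss -f -t spawn /bin/sh -c '/bin/true; /bin/true'

The -f option follows the children the shell creates, and -t spawn restricts the trace to the spawn(2) calls themselves.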

I spent the end of January gearing up for LCA, where I gave a talk about what I’ve done in Broadcom graphics since my last LCA talk 3 years earlier. Video is here.

(Unfortunately, I failed to notice the time countdown, so I didn’t make it to my fun VC5 demo, which had to be done in the hallway after)

I then spent the first week of February in Cambridge at the Raspberry Pi office working on vc4. The goal was to come up with a plan for switching to at least the “fkms” mode with no regressions, with a route to full KMS by default.

The first step was just fixing regressions for fkms in 4.14. The amusing one was mouse lag, caused by us accidentally syncing mouse updates to vblank, and an old patch to reduce HID device polling to ~60fps having been accidentally dropped in the 4.14 rebase. I think we should be at parity-or-better compared to 4.9 now.

For full KMS, the biggest thing we need to fix is getting media decode / camera capture feeding into both VC4 GL and VC4 KMS. I wrote some magic shader code to turn linear Y/U/V or Y/UV planes into tiled textures on the GPU, so that they can be sampled from using GL_OES_EGL_image_external. The kmscube demo works, and working with Dave Stevenson I got a demo mostly working of H.264 decode of Big Buck Bunny into a texture in GL on X11.

While I was there, Dave kept hammering away at the dma-buf sharing work he’s been doing. Our demo worked by having a vc4 fd create the dma-bufs, then importing those into vcsm (to talk MMAL to) and into the vc4 fd used by Mesa (mmal needs the buffers to meet its own size restrictions, so VC4 GL can’t do the allocations for it). The extra vc4 fd is a bit silly – we should be able to take vcsm buffers and export them to vc4.

Also, if VCSM could do CMA allocations for us, then we could potentially have VCSM take over the role of allocating heap for the firmware, meaning that you wouldn’t need big permanent gpu_mem= memory carveouts in order for camera and video to work.

Finally, on the last day Dave got a bit distracted and put together VC4 HVS support for the SAND tiling modifier. He showed me a demo of BBB H.264 decode directly to KMS on the console, and sent me the patch. I’ll do a little bit of polish, and send it out once I get back from vacation.

We also talked about plans for future work. I need to:

  • Polish and merge the YUV texturing support.
  • Extend the YUV texturing support to import SAND-layout buffers with no extra copies (I suspect this will be higher performance media decode into GL than the closed driver stack offered).
  • Make a (downstream-only) QPU user job submit interface so that John Cox’s HEVC decoder can cooperate with the VC4 driver to do deinterlace. (I have a long term idea of us shipping the deinterlace code as a “firmware” blob from the Linux kernel’s perspective and using that blessed blob to securely do deinterlace in the upstream kernel.)
  • Make an interface for the firmware to request a QPU user job submission from VC4, so that the firmware’s fancy camera AWB algorithm can work in the presence of the VC4 driver (right now I believe we fall back to a simpler algorithm on the VPU).
  • Investigate reports of slow PutImage-style uploads from SDL/emulators/etc.

Dave plans to:

  • Finish the VCSM rewrite to export dma-bufs and not need gpu_mem= any more.
  • Make a dma-buf enabled V4L2 mem2mem driver for H.264 decode, JPEG decode, etc. using MMAL and VCSM.

Someone needs to:

  • Use the writeback connector in X to implement rotation (which should be cheaper than using GL to do so).
  • Backdoor the dispmanx library in Raspbian to talk KMS instead when the full vc4 KMS driver is loaded (at least on the console. Maybe with some simple support for X11?).

Finally, other little updates:

  • I ported Mesa to V3D 4.2
  • Fixed some GLES3 conformance bugs for VC5
  • Fixed 3D textures for VC5
  • Worked with Boris on debugging HDMI failures in KMS, and reviewed his patches. Finally the flip_done timeouts should be gone!
February 11, 2018
As is usually the case, I'm long overdue for an update.  So this covers the last six(ish) months or so.  The first part might be old news if you follow phoronix.

Older News

In the last update, I mentioned basic a5xx compute shader support.  Late last year (and landing in the mesa 18.0 branch) I had a chance to revisit compute support for a5xx, and finished:
  • image support
  • shared variable support
  • barriers, which involved some improvements to the ir3 instruction scheduler so barriers could be scheduled in the correct order (ie. for various types of barriers, certain instructions can't be moved before/after the related barrier)
There were also some semi-related SSBO fixes, and additional r/e of instruction encodings, in particular for barriers (new cat7 group of instructions) and image vs SSBO (where different variations of the cat6 instruction encoding are used for images vs SSBOs).

Also I r/e'd and added support for indirect compute, indirect draw, texture-gather, stencil textures, and ARB_framebuffer_no_attachments on a5xx.  Which brings us pretty close to gles31 support.  And over the holiday break I r/e'd and implemented tiled texture support, because moar fps ;-)

Ilia Mirkin also implemented indirect draw, stencil texture, and ARB_framebuffer_no_attachments for a4xx.  Ilia and Wladimir J. van der Laan also landed a handful of a2xx and a20x fixes.  (But there are more a20x fixes hanging out on a branch which we still need to rebase and merge.)  It is definitely nice seeing older hw, which blob driver has long since dropped support for, getting some attention.

Other News

Not exactly freedreno related, but probably of some interest to freedreno users.. in the 4.14 kernel, my qcom_iommu driver finally landed!  This was the last piece to having the gpu working on a vanilla upstream kernel on the dragonboard 410c.  In addition, the camera driver also landed in 4.14, and venus, the v4l2 mem-to-mem driver for hw video decode/encode landed in 4.13.  (The venus driver also already has support for db820c.)

Fwiw, the v4l2 mem-to-mem driver interface is becoming the de facto standard for hw video decode/encode on SoCs.  GStreamer has had support for a long time now, and more recently ffmpeg (v3.4) and kodi have gained support as well.

When I first started on freedreno, qcom support for upstream kernels was pretty dire (ie. I think serial console support might have worked on some ancient SoC).  When I started, the only kernels that I could use to get the gpu running were old downstream msm android kernels (initially 2.6.35, and on later boards 3.4 and 3.10).  The ifc6410 was the first board on which I (eventually) could run an upstream kernel (after starting out with an msm-3.4 kernel), and the db410c was the first board I got where I never even used a downstream android kernel.  Initially db410c was upstream kernel with a pile of patches, although the size of the patchset dropped over time.  With db820c, that pattern is repeating again (ie. the patchset is already small enough that I managed to easily rebase it myself for after 4.14).  Linaro and qcom have been working quietly in the background to upstream all the various drivers that something like drm/msm depends on to work (clk, genpd, gpio, i2c, and other lower level platform support).  This is awesome to see, and the linaro/qcom developers behind this progress deserve all the thanks.  Without much fanfare, snapdragon has gone from a hopeless case (from an upstream perspective) to one of the better supported platforms!

Thanks to the upstream kernel support, and the u-boot/UEFI support which I've mentioned before, Fedora 27 supports the db410c out of the box (and the situation should be similar with other distros that have a new enough kernel, and gst/ffmpeg/kodi if you care about hw video decode).  Note that the firmware for the db410c (and db820c) has been merged into linux-firmware since that blog post.

More Recent News

More recently, I have been working on a batch of (mostly) compiler related enhancements to improve performance with things that have more complex shaders.  In particular:
  • Switch over to NIR's support for lowering phi-webs to registers, instead of dealing with phi instructions in ir3.  NIR has a much more sophisticated pass for coming out of SSA, which does a better job at avoiding the need to insert extra MOV instructions, although a bunch of RA (register allocation) related fixes were required.  The end result is fewer instructions in the resulting shader, and more importantly a reduction in register usage.
  • Using NIR's peephole_select pass to lower if/else, instead of our own pass.  This was a pretty small change (although it took some work to arrive at a decent threshold).  Previously the ir3_nir_lower_if_else pass would try to lower all if/else to select instructions, but in extreme cases this is counter-productive as it increases register pressure.  (Background: in simple cases for a GPU, executing both sides of an if/else and using a select instruction to choose the results makes sense, since GPUs tend to be a SIMT arch, and if you aren't executing both sides, you are stalling threads in a warp that took the opposite direction in the if/else.. but in extreme cases this increases register usage which reduces the # of warps in flight.)  End result was 4x speedup in alu2 benchmark, although in the real world it tends to matter less (ie. most shaders aren't that complex).
  • Better handling of sync flags across basic blocks
  • Better instruction scheduling across basic blocks
  • Better instruction scheduling for SFU instructions (ie. sqrt, rsqrt, sin, cos, etc) to avoid stalls on SFU.
  • R/e and add support for the (sat)urate flag (to avoid an extra sequence of min.f + max.f instructions to clamp a result)
  • And a few other tweaks.
The end results tend to depend on how complex the shaders a game/benchmark uses are.  At the extreme high end, a 4x improvement for alu2.  On the other hand, it probably doesn't make much difference for older games like xonotic.  Supertuxkart and most of the other gfxbench benchmarks show something along the lines of a 10-20% improvement.  Supertuxkart in particular, with the advanced pipeline, shows a 30% improvement from the combination of the compiler improvements with the previous lrz and tiled texture work (ie. FD_MESA_DEBUG=lrz,ttile)!  Some of the more complex shaders I've been looking at, like shadertoy piano, show a 25% improvement from the compiler changes alone.  (Shadertoy isn't likely to benefit from lrz/ttile since it is basically just drawing a quad with all the rendering logic in the fragment shader.)
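
If you want to try those options yourself, they are just comma-separated flags in the FD_MESA_DEBUG environment variable read by the freedreno Mesa driver, so something like this works (assuming supertuxkart is the binary you want to run):

FD_MESA_DEBUG=lrz,ttile supertuxkart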

In other news, things are starting to get interesting for snapdragon 845 (sdm845).  Initial patches for a6xx GPU support have been posted (although I still need to get my hands on a6xx hw to start r/e for userspace, so those probably won't be merged soon).  And qcom has drm/msm display support buried away in their msm-4.9 tree (expect to see a first round of patches for upstream soon.. it's a lot of code, so expect some refactoring before it is merged, but good to get this process started now).

February 09, 2018

For the past few years a clear trend of containerization of applications and services has emerged. Having processes containerized is beneficial in a number of ways. It both improves portability and strengthens security, and if done properly the performance penalty can be low.

In order to further improve security containers are commonly run in virtualized environments. This provides some new challenges in terms of supporting the accelerated graphics usecase.

OpenGL ES implementation

Currently Collabora and Google are implementing OpenGL ES 2.0 support. OpenGL ES 2.0 is the lowest common denominator for many mobile platforms and as such is a requirement for Virgil3D to be viable on those platforms.

That is the motivation for making Virgil3D work on OpenGL ES hosts.

How …

February 08, 2018

We've been getting random questions about how to install (Oracle Solaris) packages onto a newly installed Oracle Solaris 11.4 Beta. And of course the key is pointing to the appropriate IPS repository.

One of the options is to download the full repository and install it on its own locally, or add it to an existing local repository, and then just point the publisher to this local repository. This is mostly used by folks who have a test system/LDom/Kernel Zone where they will probably have one or more local repositories already.

However, experience shows that a large percentage of folks testing a beta version like this do so in a VirtualBox instance on their laptop or workstation. And because of this they want to use the GNOME Desktop rather than remotely logging in through ssh. So one of the things we do is supply an Oracle VM Template for VirtualBox which already has the solaris-desktop group package installed (officially group/system/solaris-desktop), so it shows more than the console when started and gives you the ability to run desktop tasks like Firefox and a Terminal. (Btw, as per the Release Notes on Runtime Issues there's a glitch with gnome-terminal you might run into, and you'd need to run a workaround to get it working.)

For this group of VirtualBox-based testers the chances are high that they're not going to have a local repository nearby, especially on a laptop that's moving around. This is where using our central repository at pkg.oracle.com, which is well described in the Oracle Solaris documentation, is very useful.

However, going through this there may be some minor obstacles to clear that aren't directly part of the process but get in the way when using the VirtualBox-installed OVM Template.

First, when using the Firefox browser to request and download certificates and later point to the repository, you'll need to have DNS working, and depending on the install the DNS client may not yet be enabled. Here's how you check it:

demo@solaris-vbox:~$ svcs dns/client
STATE          STIME    FMRI
disabled        5:45:26 svc:/network/dns/client:default

This is fairly simple to solve. First check that the Oracle Solaris instance has correctly picked up the DNS information from VirtualBox in the DHCP process by looking in /etc/resolv.conf. If that looks good, simply enable the dns/client service:

demo@solaris-vbox:~$ sudo svcadm enable dns/client

You'll be asked for your password and then it will be enabled. Note you can also use pfexec(1) instead of sudo(8). This will also check if your user has the appropriate privileges.

You can check if the service is running:

demo@solaris-vbox:~/Downloads$ svcs dns/client
STATE          STIME    FMRI
online         10:21:16 svc:/network/dns/client:default

Now that DNS is running, you should be able to ping pkg.oracle.com.

The second gotcha is that on the pkg-register.oracle.com page the Oracle Solaris 11.4 Beta repository is at the very bottom of the list of available repositories and should not be confused with the Oracle Solaris 11 Support repository (to which you may already have requested access) listed at the top of the page.

The same certificate/key pair is used for any of the Oracle Solaris repositories; however, in order to permit the use of any existing cert/key pair, the license for the Oracle Solaris 11.4 Beta repository must be accepted. This means selecting the 'Request Access' button next to the Solaris 11.4 Beta repository entry.

Once you have the cert/key, or you have accepted the license, then you can configure the beta repository as:

pkg set-publisher -k <your-key> -c <your-cert> -g https://pkg.oracle.com/solaris/beta solaris

With the Virtual Box image the default repository setup includes the 'release' repository. It is best to remove that:

pkg set-publisher -G http://pkg.oracle.com/solaris/release solaris

This can be performed in one command:

pkg set-publisher -k <your-key> -c <your-cert> -G http://pkg.oracle.com/solaris/release -g https://pkg.oracle.com/solaris/beta solaris

Note that here too you'll need to use either pfexec(1) or sudo(8) again. This should kick off the pkg(1) command, and once it's done you can check its status with:

demo@solaris-vbox:~/Downloads$ pkg publisher solaris
            Publisher: solaris
                Alias:
           Origin URI: https://pkg.oracle.com/solaris/beta/
        Origin Status: Online
              SSL Key: /var/pkg/ssl/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
             SSL Cert: /var/pkg/ssl/xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
 Cert. Effective Date: January 29, 2018 at 03:04:58 PM
Cert. Expiration Date: February 6, 2020 at 03:04:58 PM
          Client UUID: xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx
      Catalog Updated: January 24, 2018 at 02:09:16 PM
              Enabled: Yes

And now you're up and running.

A final thought: if, for example, you've chosen to install the Text Install version of the Oracle Solaris 11.4 Beta because you want to have a nice minimal install without the overhead of GNOME and things like that, you can also download the key and certificate to another system or the hosting OS (in case you're using VirtualBox) and then rsync or rcp them across and then follow all the same steps.
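
For example, something along these lines from the system you downloaded them on (the file names here are just placeholders; use whatever names the key and certificate were saved under, and your VM's address):

rsync pkg.oracle.com.key.pem pkg.oracle.com.certificate.pem demo@<vm-address>:

and then run the pkg set-publisher command shown above inside the VM, pointing -k and -c at the copied files.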

February 05, 2018

The number one use case for live migration today is for evacuation: when a Solaris Zones host needs some maintenance operation that involves a reboot, then the zones are live migrated to some other willing host. This avoids scheduling simultaneous maintenance windows for all the services provided by those zones.

Implementing this today on Solaris 11.3 involves manually migrating zones with individual zoneadm migrate commands, and especially, determining suitable destinations for each of the zones. To make this common scenario simpler and less error prone, Solaris 11.4 Beta comes with a new command sysadm(8) for system maintenance that also allows for zone evacuation.

The basic idea of how it is supposed to be used is like this:

# pkg update
...
# sysadm maintain -s -m "updating to new build"
# sysadm evacuate -v
Evacuating 3 zones...
Migrating myzone1 to rads://destination1/ ...
Migrating myzone3 to rads://destination1/ ...
Migrating myzone4 to rads://destination2/ ...
Done in 3m30s.
# reboot
...
# sysadm maintain -e
# sysadm evacuate -r
...

When in maintenance mode, an attempt to attach or boot any zone is refused: if the admin is trying to move zones off the host, it's not helpful to allow incoming zones. Note that this maintenance mode is recorded system-wide, not just in the zones framework; even though the only current impact is on zones, it seems likely other sub-systems may find it useful in the future.

To set up an evacuation target for a zone, the SMF property evacuation/target on the given zone service instance system/zones/zone:<zone-name> must be set to the target host. You can use either a rads:// or an ssh:// location identifier, e.g.: ssh://janp@mymachine.mydomain.com. Do not forget to refresh the service instance for your change to take effect.
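
For example, to point the evac1 zone used below at host bjork, something like this should do (a sketch using the standard svccfg/svcadm commands; depending on your configuration you may first need to create the evacuation property group with svccfg addpg):

# svccfg -s system/zones/zone:evac1 setprop evacuation/target = astring: ssh://root@bjork
# svcadm refresh system/zones/zone:evac1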

You can evacuate running Kernel Zones as well as installed native and Kernel Zones. An evacuation always means evacuating running zones; with the option -a, installed zones are included as well. Only zones with the evacuation/target property set in their service instance are scheduled for evacuation. However, if any running zone (and also any installed zone if evacuate -a is used) does not have the property set, the overall result of the evacuation will be reported as failed by sysadm, which is logical, as an evacuation by definition means evacuating everything.

As live zone migration does not support native zones, those can only be evacuated in the installed state. Also note that you can only evacuate zones installed on shared storage, for example on iSCSI volumes. See the storage URI manual page, suri(7), for information on what other shared storage is supported. Note that you can install Kernel Zones on NFS files as well.

To setup live Kernel Zone migration, please check out Migrating an Oracle Solaris Kernel Zone section of the 11.4 online documentation.

Now, let's see a real example. We have a few zones on host nacaozumbi. All running and installed zones are on shared storage, including the native zone tzone1 and Kernel Zone evac1:

root:nacaozumbi:~# zonecfg -z tzone1 info rootzpool
rootzpool:
	storage: iscsi://saison/luname.naa.600144f0dbf8af1900005582f1c90007
root:nacaozumbi:~# zonecfg -z evac1 info device
device:
	storage: iscsi://saison/luname.naa.600144f0dbf8af19000058ff48060017
	id: 1
	bootpri: 0
root:nacaozumbi:~# zoneadm list -cv
  ID NAME      STATUS      PATH                  BRAND      IP
   0 global    running     /                     solaris    shared
  82 evac3     running     -                     solaris-kz excl
  83 evac1     running     -                     solaris-kz excl
  84 evac2     running     -                     solaris-kz excl
   - tzone1    installed   /system/zones/tzone1  solaris    excl
   - on-fixes  configured  -                     solaris-kz excl
   - evac4     installed   -                     solaris-kz excl
   - zts       configured  -                     solaris-kz excl

Zones not set up for evacuation were detached - ie. on-fixes and zts. All running and installed zones are set to be evacuated to bjork, for example:

root:nacaozumbi:~# svccfg -s system/zones/zone:evac1 listprop evacuation/target
evacuation/target  astring     ssh://root@bjork

Now, let's start the maintenance window:

root:nacaozumbi:~# sysadm maintain -s -m "updating to new build"
root:nacaozumbi:~# sysadm maintain -l
TYPE   USER   DATE             MESSAGE
admin  root   2018-02-02 01:10 updating to new build

At this point we can no longer boot or attach zones on nacaozumbi:

root:nacaozumbi:~# zoneadm -z on-fixes attach
zoneadm: zone 'on-fixes': attach prevented due to system maintenance: see sysadm(8)

And that also includes migrating zones to nacaozumbi:

root:bjork:~# zoneadm -z on-fixes migrate ssh://root@nacaozumbi
zoneadm: zone 'on-fixes': Using existing zone configuration on destination.
zoneadm: zone 'on-fixes': Attaching zone.
zoneadm: zone 'on-fixes': attach failed:
zoneadm: zone 'on-fixes': attach prevented due to system maintenance: see sysadm(8)

Now we start evacuating all the zones. In this example, all running and installed zones have their service instance property evacuation/target set. The option -a means all the zones, that is including those installed. The -v option provides verbose output.

root:nacaozumbi:~# sysadm evacuate -va
sysadm: preparing 5 zone(s) for evacuation ...
sysadm: initializing migration of evac1 to bjork ...
sysadm: initializing migration of evac3 to bjork ...
sysadm: initializing migration of evac4 to bjork ...
sysadm: initializing migration of tzone1 to bjork ...
sysadm: initializing migration of evac2 to bjork ...
sysadm: evacuating 5 zone(s) ...
sysadm: migrating tzone1 to bjork ...
sysadm: migrating evac2 to bjork ...
sysadm: migrating evac4 to bjork ...
sysadm: migrating evac1 to bjork ...
sysadm: migrating evac3 to bjork ...
sysadm: evacuation completed successfully.
sysadm: evac1: evacuated to ssh://root@bjork
sysadm: evac2: evacuated to ssh://root@bjork
sysadm: evac3: evacuated to ssh://root@bjork
sysadm: evac4: evacuated to ssh://root@bjork
sysadm: tzone1: evacuated to ssh://root@bjork

While being evacuated, you can check the state of evacuation like this:

root:nacaozumbi:~# sysadm evacuate -l
sysadm: evacuation in progress

After the evacuation is done, you can also see the details like this (for example, in case you did not run it in verbose mode):

root:nacaozumbi:~# sysadm evacuate -l -o ZONENAME,STATE,DEST
ZONENAME STATE     DEST
evac1    EVACUATED ssh://root@bjork
evac2    EVACUATED ssh://root@bjork
evac3    EVACUATED ssh://root@bjork
evac4    EVACUATED ssh://root@bjork
tzone1   EVACUATED ssh://root@bjork

And you can see all the evacuated zones are now in the configured state on the source host:

root:nacaozumbi:~# zoneadm list -cv
  ID NAME      STATUS      PATH                  BRAND      IP
   0 global    running     /                     solaris    shared
   - tzone1    configured  /system/zones/tzone1  solaris    excl
   - evac1     configured  -                     solaris-kz excl
   - on-fixes  configured  -                     solaris-kz excl
   - evac4     configured  -                     solaris-kz excl
   - zts       configured  -                     solaris-kz excl
   - evac3     configured  -                     solaris-kz excl
   - evac2     configured  -                     solaris-kz excl

And the migrated zones are happily running or in the installed state on host bjork:

jpechane:bjork:~$ zoneadm list -cv
  ID NAME      STATUS     PATH                  BRAND      IP
   0 global    running    /                     solaris    shared
  57 evac3     running    -                     solaris-kz excl
  58 evac1     running    -                     solaris-kz excl
  59 evac2     running    -                     solaris-kz excl
   - on-fixes  installed  -                     solaris-kz excl
   - tzone1    installed  /system/zones/tzone1  solaris    excl
   - zts       installed  -                     solaris-kz excl
   - evac4     installed  -                     solaris-kz excl

The maintenance state is still held at this point:

root:nacaozumbi:~# sysadm maintain -l
TYPE   USER   DATE             MESSAGE
admin  root   2018-02-02 01:10 updating to new build

Upgrade the system with a new boot environment, unless you did that before (which you should have, to keep the time your zones are running on the other host to a minimum):

root:nacaozumbi:~# pkg update --be-name=.... -C0 entire@...
root:nacaozumbi:~# reboot

Now, finish the maintenance mode.

root:nacaozumbi:~# sysadm maintain -e

And as the final step, return all the evacuated zones now. As explained before, you would not be able to do this if the host were still in maintenance mode.

root:nacaozumbi:~# sysadm evacuate -ra
sysadm: preparing zones for return ... 5/5
sysadm: returning zones ... 5/5
sysadm: return completed successfully.

Possible enhancements for the future we are considering include specifying multiple targets and a spread policy, with a resource utilisation comparison algorithm that would consider CPU arch, RAM and CPU resources.

This is part two in my series of posts about Solaris Analytics in the Solaris 11.4 release. You may find part one here.

The Solaris Analytics WebUI (or "bui" for short) is what we use to tie together all our data gathering from the Stats Store. It is comprised of two web apps (titled "Solaris Dashboard" and "Solaris Analytics"). To get started, enable the webui service via

# svcadm enable webui/server

Once the service is online, point your browser at https://127.0.0.1:6787 and log in. [Note that the self-signed certificate is that generated by your system, and adding an exception for it in your browser is fine]. Rather than roll our own toolkit, we make use of Oracle Jet, which means we can keep a consistent look and feel across Oracle web applications.

After logging in, you'll see yourself at the Oracle Solaris Web Dashboard, which shows an overview of several aspects of your system, along with Faults (FMA) and Solaris Audit activity if your user has sufficient privileges to read them.
 

 

Mousing over any of the visualizations on this page will give you a brief description of what the visualization provides, and clicking on it will take you to a more detailed page.

If you click on the hostname in the top bar (next to Applications), you'll see what we call the Host Drawer. This pulls information from svc:/system/sysstat.


Click the 'x' on the top right to close the drawer.



Selecting Applications / Solaris Analytics will take you to the main part of the bui:


I've selected the NFS Client sheet, resulting in the dark shaded box on the right popping up with a description of what the sheet will show you.
 

Building blocks: faults, utilization and audit events
In the previous installment I mentioned that we wanted to provide a way for you to tie together the many sources of information we provide, so that you could answer questions about your system. This is a small example of how you can do so.

The host these screenshots were taken from is a single-processor, four-core Intel-based workstation. In a terminal window I ran

# psradm -f 3

followed a few minutes later by

# psradm -n 3

You can see those events marked on each of the visualizations with a blue triangle here:


Now if I mouseover the triangle marking the second offline/online pair, in the Thread Migrations viz, I can see that the system generated a Solaris Audit event:


This allows us to observe that the changes in system behaviour (primarily load average and thread migrations across cores) were correlated with the offlining of a cpu core.


Finally, let's have a look at the Audit sheet. To view the stats on this page, you need to log in to the bui as a suitably privileged user - either root, or a user with the solaris.sstore.read.sensitive authorization (which can be granted with the usermod(8) command shown below).

 

# usermod -A +solaris.sstore.read.sensitive $USER  

For this screenshot I not only redid the psradm operations from earlier, I also tried making an ssh connection with an unknown user, and logged in on another of this system's virtual consoles. There are many other things you could observe with the audit subsystem; this is just a glimpse:


Tune in next time for a discussion of using the C and Python bindings to the Stats Store so you can add your own statistics.

February 03, 2018


A recording of the talk can be found here.

Downloads

If you're curious about the slides, you can download the PDF or the OTP.

Thanks

This post has been a part of work undertaken by my employer Collabora.

I would like to thank the wonderful organizers and volunteers of FOSDEM, for hosting a great community event.

Composite acceleration in the X server

One of the persistent problems with the modern X desktop is the number of moving parts required to display application content. Consider a simple PresentPixmap call as made by the Vulkan WSI or GL using DRI3:

  1. Application calls PresentPixmap with new contents for its window

  2. X server receives that call and pends any operation until the target frame

  3. At the target frame, the X server copies the new contents into the window pixmap and delivers a Damage event to the compositor

  4. The compositor responds to the damage event by copying the window pixmap contents into the next screen pixmap

  5. The compositor calls PresentPixmap with the new screen contents

  6. The X server receives that call and either posts a Swap call to the kernel or delays any action until the target frame

This sequence has a number of issues:

  • The operation is serialized between three processes with at least three context switches involved.

  • There is no traceable relation between when the application asked for the frame to be shown and when it is finally presented. Nor do we even have any way to tell the application what time that was.

  • There are at least two copies of the application contents, from DRI3 buffer to window pixmap and from window pixmap to screen pixmap.

We'd also like to be able to take advantage of the multi-plane capabilities in the display engine (where available) to directly display the application contents.

Previous Attempts

I've tried to come up with solutions to this issue a couple of times in the past.

Composite Redirection

My first attempt to solve (some of) this problem was through composite redirection. The idea there was to directly pass the Present'd pixmap to the compositor and let it copy the contents directly from there in constructing the new screen pixmap image. With some additional hand waving, the idea was that we could associate that final presentation with all of the associated redirected compositing operations and at least provide applications with accurate information about when their images were presented.

This fell apart when I tried to figure out how to plumb the necessary events through to the compositor and back. With that, and the realization that we still weren't solving problems inherent with the three-process dance, nor providing any path to using overlays, this solution just didn't seem worth pursuing further.

Automatic Compositing

More recently, Eric Anholt and I have been discussing how to have the X server do all of the compositing work by natively supporting ARGB window content. By changing compositors to place all screen content in windows, the X server could then generate the screen image by itself and not require any external compositing manager assistance for each frame.

Given that a primitive form of automatic compositing is already supported, extending that to support ARGB windows and having the X server manage the stack seemed pretty tractable. We would extend the driver interface so that drivers could perform the compositing themselves using a mixture of GPU operations and overlays.

This runs up against five hard problems though.

  1. Making transitions between Manual and Automatic compositing seamless. We've seen how well the current compositing environment works when flipping compositing on and off to allow full-screen applications to use page flipping. Lots of screen flashing and application repaints.

  2. Dealing with RGB windows with ARGB decorations. Right now, the window frame can be an ARGB window with the client being RGB; painting the client into the frame yields an ARGB result with the A values being 1 everywhere the client window is present.

  3. Mesa currently allocates buffers exactly the size of the target drawable and assumes that the upper left corner of the buffer is the upper left corner of the drawable. If we want to place window manager decorations in the same buffer as the client and not need to copy the client contents, we would need to allocate a buffer large enough for both client and decorations, and then offset the client within that larger buffer.

  4. Synchronizing window configuration and content updates with the screen presentation. One of the major features of a compositing manager is that it can construct complete and consistent frames for display; partial updates to application windows need never be shown to the user, nor does the user ever need to see the window tree partially reconfigured. To make this work with automatic compositing, we'd need to both codify frame markers within the 2D rendering stream and provide some method for collecting window configuration operations together.

  5. Existing compositing managers don't do this today. Compositing managers are currently free to paint whatever they like into the screen image; requiring that they place all screen content into windows would mean they'd have to buy in to the new mechanism completely. That could still work with older X servers, but the additional overhead of more windows containing decoration content would slow performance with those systems, making migration less attractive.

I can think of plausible ways to solve the first three of these without requiring application changes, but the last two require significant systemic changes to compositing managers. Ick.

Semi-Automatic Compositing

I was up visiting Pierre-Loup at Valve recently and we sat down for a few hours to consider how to help applications regularly present content at known times, and to always know precisely when content was actually presented. That names just one of the above issues, but when you consider the additional work required by pure manual compositing, solving that one issue is likely best achieved by solving all three.

I presented the Automatic Compositing plan and we discussed the range of issues. Pierre-Loup focused on the last problem -- getting existing Compositing Managers to adopt whatever solution we came up with. Without any easy migration path for them, it seemed like a lot to ask.

He suggested that we come up with a mechanism which would allow Compositing Managers to ease into the new architecture and slowly improve things for applications. Towards that, we focused on a much simpler problem:

How can we get a single application at the top of the window stack to reliably display frames at the desired time, and to know when that doesn't occur?

Coming up with a solution for this led to a good discussion and a possible path to a broader solution in the future.

Steady-state Behavior

Let's start by ignoring how we start and stop this new mode and look at how we want applications to work when things are stable:

  1. Windows not moving around
  2. Other applications idle

Let's get a picture I can use to describe this:

In this picture, the compositing manager is triple buffered (as is normal for a page flipping application) with three buffers:

  1. Scanout. The image currently on the screen

  2. Queued. The image queued to be displayed next

  3. Render. The image being constructed from various window pixmaps and other elements.

The contents of the Scanout and Queued buffers are identical with the exception of the orange window.

The application is double buffered:

  1. Current. What it has displayed for the last frame

  2. Next. What it is constructing for the next frame

Ok, so in the steady state, here's what we want to happen:

  1. Application calls PresentPixmap with 'Next' for its window

  2. X server receives that call and copies Next to Queued.

  3. X server posts a Page Flip to the kernel with the Queued buffer

  4. Once the flip happens, the X server swaps the names of the Scanout and Queued buffers.

If the X server supports Overlays, then the sequence can look like:

  1. Application calls PresentPixmap

  2. X server receives that call and posts a Page Flip for the overlay

  3. When the page flip completes, the X server notifies the client that the previous Current buffer is now idle.

When the Compositing Manager has content to update outside of the orange window, it will:

  1. Compositing Manager calls PresentPixmap

  2. X server receives that call and paints the Current client image into the Render buffer

  3. X server swaps Render and Queued buffers

  4. X server posts Page Flip for the Queued buffer

  5. When the page flip occurs, the server can mark the Scanout buffer as idle and notify the Compositing Manager

If the Orange window is in an overlay, then the X server can skip step 2.

The Auto List

To give the Compositing Manager control over the presentation of all windows, each call to PresentPixmap by the Compositing Manager will be associated with the list of windows, the "Auto List", for which the X server will be responsible for providing suitable content. Transitioning from manual to automatic compositing can therefore be performed on a window-by-window basis, and each frame provided by the Compositing Manager will separately control how that happens.

The Steady State behavior above would be represented by having the same set of windows in the Auto List for the Scanout and Queued buffers, and when the Compositing Manager presents the Render buffer, it would also provide the same Auto List for that.

Importantly, the Auto List need not contain only children of the screen Root window. Any descendant window at all can be included, and the contents of that drawn into the image using appropriate clipping. This allows the Compositing Manager to draw the window manager frame while the client window is drawn by the X server.

Any window at all can be in the Auto List. Windows with PresentPixmap contents available would be drawn from those. Other windows would be drawn from their window pixmaps.

Transitioning from Manual to Auto

To transition a window from Manual mode to Auto mode, the Compositing Manager would add it to the Auto List for the Render image, and associate that Auto List with the PresentPixmap request for that image. For the first frame, the X server may not have received a PresentPixmap for the client window, and so the window contents would have to come from the Window Pixmap for the client.

I'm not sure how we'd get the Compositing Manager to provide another matching image that the X server can use for subsequent client frames; perhaps it would just create one itself?

Transitioning from Auto to Manual

To transition a window from Auto mode to Manual mode, the Compositing manager would remove it from the Auto List for the Render image and then paint the window contents into the render image itself. To do that, the X server would have to paint any PresentPixmap data from the client into the window pixmap; that would be done when the Compositing Manager called GetWindowPixmap.

New Messages Required

For this to work, we need some way for the Compositing Manager to discover windows that are suitable for Auto compositing. Normally, these will be windows managed by the Window Manager, but it's possible for them to be nested further within the application hierarchy, depending on how the application is constructed.

I think what we want is to tag Damage events with the source window, and perhaps additional information to help Compositing Managers determine whether it should be automatically presenting those source windows or a parent of them. Perhaps it would be helpful to also know whether the Damage event was actually caused by a PresentPixmap for the whole window?

To notify the server about the Auto List, a new request will be needed in the Present extension to set the value for a subsequent PresentPixmap request.

Actually Drawing Frames

The DRM module in the Linux kernel doesn't provide any mechanism to remove or replace a Page Flip request. While this may get fixed at some point, we need to deal with how it works today, if only to provide reasonable support for existing kernels.

I think about the best we can do is to set a timer to fire a suitable time before vblank and have the X server wake up and execute any necessary drawing and Page Flip kernel calls. We can use feedback from the kernel to know how much slack time there was between any drawing and the vblank and adjust the timer as needed.

Given that the goal is to provide for reliable display of the client window, it might actually be sufficient to let the client PresentPixmap request drive the display; if the Compositing Manager provides new content for a frame where the client does not, we can schedule that for display using a timer before vblank. When the Compositing Manager provides new content after the client, it would be delayed until the next frame.

Changes in Compositing Managers

As described above, one explicit goal is to ease the burden on Compositing Managers by making them able to opt-in to this new mechanism for a limited set of windows and only for a limited set of frames. Any time they need to take control over the screen presentation, a new frame can be constructed with an empty Auto List.

Implementation Plans

This post is the first step in developing these ideas to the point where a prototype can be built. The next step will be to take feedback and adapt the design to suit. Of course, there's always the possibility that this design will also prove unworkable in practice, but I'm hoping that this third attempt will actually succeed.

February 02, 2018

Long time no see; quite a few things have happened for sure. So let’s begin with that.

The past

My last post was from 25th August 2017. It was about my GSoC project and how I was preparing the final patch set, that would then be posted to the xorg-devel mailing list.

That’s quite some time ago and I also didn’t follow up on what exactly happened now with the patch set.

Regarding the long pause in communication, it was because of my Master’s thesis in mathematics. I finished it in December and the title is “Vertex-edge graphs of hypersimplices: combinatorics and realizations”.

While the thesis was a lot of work, I’m very happy with the result. I found a relatively intuitive approach to hypersimplices describing them as geometric objects and in the context of graph theory. I even wrote a small application that calculates certain properties of arbitrary hypersimplices and depicts their spectral representations up to the fourth dimension with Qt3D.

I’m currently waiting for my grade, but besides that my somewhat long student career suddenly came to an end.

Regarding my patch set: It did not get merged directly, but I got some valuable feedback from experienced Xserver devs back then. Of course I didn’t want to give up on them, but I had to first work on my thesis and I planned to rework the patches once the thesis was handed in.

At this time I also watched some of the videos from XDC 2017 and was happily surprised that my mentor, Daniel Stone, said that he wants my GSoC work in the next Xserver release. His trust in my work really motivated me. I also had some contact with Intel devs, who said that they look forward to my project being merged.

So after I handed in my thesis, I first was working on some other stuff and also needed some time off after the exhausting thesis end phase, but in the last two weeks I reworked my patches and posted a new patch set to the mailing list. I hope this patch set can be accepted in the upcoming Xserver 1.20 release.

The future

I had already known for a long time that after my master’s degree in mathematics I wanted to leave university and not go for a scientific career. The first reason for this was that after 10 years of study, most of the time spent on very abstract topics, I just wanted to interact with some real-world problems again. And in retrospect I was always most motivated in my studies when I could connect abstract theory with practical problems in social science or engineering.

Since computers have been a passion of mine since a young age, since the most interesting technological achievements currently happen in the digital field, and since it is work somewhat close to that of a mathematician, I decided to go in this direction.

I had participated in some programming courses through my studies - and in one semester break created a Pong clone in Java for mobile phones being operated by phone movement; it was fun but will forever remain in the depths of one of my old hard disks somewhere - but I had to learn much more if I wanted to work on interesting projects.

In order to build up my own experience, almost exactly two years ago I picked a well-known open-source project, which I found interesting for several reasons, to work on. Of course at first I took baby steps, but later on I could accelerate.

So while writing the last paragraph it became apparent to me that indeed this all was still describing the past. But to know where you’re heading, you need to know where you’re coming from, bla, bla. Anyway, finally looking forward: I now have the great opportunity to work full-time on KDE technology, thanks to Blue Systems.

To me this foremost means helping Martin with the remaining tasks for making Plasma Wayland the new default. I will also work on some ARM devices, which in particular means being more exposed to kernel development. That sounds interesting!

Finally, with my GSoC project I already have some experience working on an upstream freedesktop.org project. So another goal for me is to foster the relationship of the Plasma project with upstream graphics development by contributing code and feedback. In comparison to GNOME we were a bit underrepresented in this regard, most of all through financial constraints of course.

Another topic, more long-term, that I’m personally interested in, is KWin as a VR/AR platform. I imagine possible applications kind of like Google tried it with their Glass project. Just as a full desktop experience with multiple application windows floating in front of you. Basically like in every other science fiction movie up to date. But yeah, first our Wayland session, then the future.

The FOSDEM

Writing these lines I’m sitting in a train to Brussels. So if you want to meet up and talk about anything, you will presumably often find me the next two days at the KDE booth or on Saturday in the graphics devroom. But this is my first time at FOSDEM, so maybe I’ll just stand somewhere in between and am not able to orientate myself anymore. In this case please lead me back to the KDE booth. Thanks in advance and I look forward to meeting you and other interesting people in the next two days at FOSDEM.

Oracle Solaris 11.4 Beta (#solaris114beta) was released earlier this week, here is the announcement blog in case you missed it.

There are lots of updates in this release, including many improvements that simplify development and enhance our ELF and linker support. Check out these excellent posts from our very own Ali Bahrami to learn more!

Hi All,

Update: Thank you everyone for all the test reports I've received. The response has been quite overwhelming, with over 50 test reports received so far. The results are all over the place: some people see no changes, some people report the approximately 0.5W saving my own tests show, and many people also report display problems, sometimes combined with a significant increase in power consumption. I need to take a closer look at all the results, but right now I believe that the best way forward with this is (unfortunately) a whitelist matching on a combination of panel-id (from the EDID) and DMI data, so that we can at least enable this on popular models (any model with at least one user willing to contribute).

As you've probably read already, I'm working on improving Linux laptop battery life. Previously I've talked about enabling SATA link power management by default. This has been enabled in rawhide / Fedora 28 since January 1st and so far no issues have been reported. This is really good news as it leads to significantly better idle power consumption (1 - 1.5W lower) on laptops with SATA disks. Fedora 28 will also enable HDA codec autosuspend and autosuspend for USB Bluetooth controllers, for another (approximately) 0.8W gain.
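As an aside, if you want to check which SATA link power management policy your disks are currently using, one quick way (a small sketch of my own, assuming a reasonably recent kernel with the standard sysfs layout) is to read it from sysfs:

  $ cat /sys/class/scsi_host/host*/link_power_management_policy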

But we're not done yet: testing on a Lenovo T440s and X240 has shown that enabling Panel Self Refresh (PSR) by setting i915.enable_psr=1 saves another 0.5W. Enabling this on all machines has been tried in the past, and it causes problems on some machines, so we will likely need either a blacklist or a whitelist for this. I'm leaning towards a whitelist to avoid regressions, but if there are, say, only 10-20 models which don't work with it, a blacklist makes more sense. So the most important thing to do right now is gather more data, hence this blog post.

So I would like to ask everyone who runs Linux on their laptop (with a recent-ish kernel) to test this and gather some data for me:

  1. Check if your laptop uses an eDP panel: run "ls /sys/class/drm"; there should be a card?-eDP-1 entry there. If not, your laptop is using LVDS or DSI for the panel, and this does not apply to your laptop.

  2. Check that your machine supports PSR: run "cat /sys/kernel/debug/dri/0/i915_edp_psr_status". If this says "PSR not supported", then this does not apply to your laptop.

  3. Get a baseline power consumption measurement. Install powertop ("sudo dnf install powertop" on Fedora), then close all your apps except for one terminal, maximize that terminal and run "sudo powertop". Unplug your laptop if it is plugged in and wait 5 minutes; on some laptops the power measurement is a moving average, so this is necessary to get a reliable reading. Now look at the power consumption shown (e.g. 7.95W) and watch it for a couple of refreshes, as it sometimes spikes when something wakes up to do some work. Write down the lowest value you see; this is our base value for your laptop's power consumption. Note: beware of "dim screen when idle" messing with your brightness; either make sure you do not touch the laptop for a couple of minutes before taking the reading, or turn this feature off in your power settings.

  4. Add "i915.enable_psr=1" to your kernel cmdline and reboot. Check that the LCD panel still works, and that suspend/resume and blanking the screen (e.g. by locking it under GNOME3) still work.

  5. Check that PSR actually is enabled now (your panel might not support it): run "cat /sys/kernel/debug/dri/0/i915_edp_psr_status" and check that it says both "Enabled: yes" and "Active: yes".

  6. Measure idle power consumption again as described under step 3. Make sure you use the same LCD brightness setting as before, and write down the new value.

  7. Dump your LCD panel's EDID: run "cat /sys/class/drm/card0-eDP-1/edid > panel-edid"

  8. Send me a mail at hdegoede@redhat.com with the following in there:


  • Report of success or bad side effects

  • The idle power consumption before and after the changes

  • The brand and model of your laptop

  • The "panel-edid" file attached

  • The output of the following commands:

  • cat /proc/cpuinfo | grep "model name" | uniq

  • cat /sys/class/dmi/id/modalias

Once I have info from enough models, hopefully I can come up with some way for us to enable PSR by default, or at least build a whitelist with popular laptop models and enable it there.
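For convenience, here is a small shell sketch (my own addition, not part of the original instructions) that automates the read-only checks from steps 1, 2 and 7 and collects the system information from step 8; the paths are exactly the ones listed above, but treat it as an illustration rather than a polished tool:

  #!/bin/bash
  # psr-report.sh - gather the PSR-related data described above (a sketch).
  # Steps 3-6 (powertop measurements, kernel cmdline change) still need to
  # be done by hand.

  # Step 1: check for an eDP panel
  if ! ls /sys/class/drm | grep -q 'eDP'; then
      echo "No eDP panel found, this test does not apply to your laptop."
      exit 0
  fi

  # Step 2: check PSR support (debugfs is root-only, hence sudo)
  sudo cat /sys/kernel/debug/dri/0/i915_edp_psr_status

  # Step 7: dump the panel EDID (adjust card0-eDP-1 if your card number differs)
  cat /sys/class/drm/card0-eDP-1/edid > panel-edid

  # Step 8: collect CPU model and DMI modalias for the report mail
  grep "model name" /proc/cpuinfo | uniq
  cat /sys/class/dmi/id/modalias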

Thanks & Regards,

Hans

In Solaris 11.3 we provided the ability to use the Silicon Secured Memory feature of the Oracle SPARC processors in the M7 and M8 families. An API for applications to explicitly manage ADI (Application Data Integrity) versioning was provided (see the adi(2) man page), as well as a new memory allocator library, libadimalloc(3LIB).

This required either code changes to the application or arranging to set LD_PRELOAD_64=/usr/lib/64/libadimalloc.so.1 in the environment variables before the application started. The libadimalloc(3LIB) allocator was derived from the libumem(3LIB) codebase but doesn't expose all of the features that libumem does.

With Oracle Solaris 11.4 Beta the use of ADI has been integrated into the default system memory allocator in libc(3LIB) and libumem(3LIB), while retaining libadimalloc(3LIB) for backwards compatibility with Oracle Solaris 11.3 systems.

Control of which processes run with ADI protection is now via the Security Extensions Framework, using sxadm(8), so it is no longer necessary to set the $LD_PRELOAD_64 environment variable.

There are two distinct ADI-based protections exposed via the Security Extensions Framework, ADISTACK and ADIHEAP. These complement the existing extensions introduced in earlier Oracle Solaris 11 update releases: ASLR, NXHEAP, and NXSTACK (all three of which are available on SPARC and x86 CPU systems).

ADIHEAP is how the ADI protection is exposed via the standard libc memory allocator and via libumem. The ADISTACK extension, as the name suggests, is for protecting the register save area of the stack.

$ sxadm status
EXTENSION   STATUS                   CONFIGURATION
aslr        enabled (tagged-files)   default (default)
nxstack     enabled (all)            default (default)
nxheap      enabled (tagged-files)   default (default)
adiheap     enabled (tagged-files)   default (default)
adistack    enabled (tagged-files)   default (default)

The above output from sxadm shows the default configuration of an Oracle SPARC M7/M8 based system. What we can see here is that some of the security extensions, including adiheap/adistack, are enabled by default only for tagged files. Executable binaries can be tagged using ld(1) as documented in sxadm(8); for example, if we want to tag an application at build time to use adiheap we would add '-z sx=adiheap'. Note that it is not meaningful at this time to tag shared libraries, only leaf executable programs.
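As an illustration (a sketch of my own, not taken from the sxadm(8) examples), tagging a program for adiheap at build time just means passing that ld option through your normal build, for instance via the compiler driver:

  $ cc -o myapp myapp.c -z sx=adiheap

The '-z sx=adiheap' option here is the one mentioned above; whether you pass it directly to ld or via cc depends on how your build is set up.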

Most executables in Oracle Solaris were already tagged to run with the aslr, nxstack, and nxheap security extensions; now many of them are tagged for ADISTACK and ADIHEAP as well. For the Oracle Solaris 11.4 release we have also had to explicitly tag some executables to not run with ADIHEAP and/or ADISTACK. This is either due to outstanding issues when running with an ADI allocator or, in some cases, to more fundamental issues with how the program itself works (the ImageMagick graphics image processing tool is one such example where ADISTACK is explicitly disabled).

The sxadm command can be used to start processes with security extensions enabled regardless of the system-wide status and binary tagging. For example, to start a program that was not tagged at build time with both ADI-based protections, in addition to its binary-tagged extensions:

$ sxadm exec -s adistack=enable -s adiheap=enable /path/to/program

It is possible to edit binary executables to add the security extension tags, even if there were none present at link time. Explicit tagging of binaries already installed on a system and delivered by any package management software is not recommended.

If all of the untagged applications that are deployed to be run on a system have been tested to work with the ADI protections, then it is possible to change the system-wide defaults rather than having to use sxadm to run the processes:

# sxadm enable adistack,adiheap

The Oracle Solaris 11.4 Beta also has support for using ADI to protect kernel memory; this is currently undocumented but is planned to be exposed via sxadm by the 11.4 release or soon after. The KADI support also includes a significant amount of ADI support in mdb, for both live and post-mortem kernel debugging. KADI is enabled by default with precise traps when running a debug build of the kernel. The debug builds are published in the public Oracle Solaris 11.4 Beta repository and can be enabled by running:

# pkg change-variant debug.osnet=true

The use of ADI via the standard libc and libumem memory allocators and by the kernel (in LDOMs and Zones including with live migration/suspend) has enabled the Oracle Solaris engineering team to find and fix many otherwise difficult to find or diagnose bugs. However we are not yet at a point where we believe all applications from all vendors are sufficiently well behaved that the ADISTACK and ADIHEAP protections can be enabled by default.

I’ve done a talk about the kernel community. It’s a hot take, but with the feedback I’ve received thus far I think it was on the spot, and started a lot of uncomfortable, but necessary discussion. I don’t think it’s time yet to give up on this project, even if it will take years.

Without further ado, the recording of my talk “Burning Down the Castle” is on YouTube. For those who prefer reading, LWN has you covered with “Too many lords, not enough stewards”. I think Jake Edge and Jon Corbet have done an excellent job of capturing my talk in a balanced fashion.

Further Discussion

For understanding abuse dynamics I can’t recommend “Why Does He Do That?: Inside the Minds of Angry and Controlling Men” by Lundy Bancroft enough. All the examples are derived from a few decades of working with abusers in personal relationships, but the patterns and archetypes that Lundy Bancroft extracts transfer extremely well to any other kind of relationship, whether that’s work, family or open source communities.

There’s endless amounts of stellar talks about building better communities. I’d like to highlight just two: “Life is better with Rust’s community automation” by Emily Dunham and “Have It Your Way: Maximizing Drive-Thru Contribution” by VM Brasseur. For learning more there’s lots of great community topic tracks at various conferences, but also dedicated ones - often as unconferences: Community Leadership Summit, including its various offsprings and maintainerati are two I’ve been at and learned a lot.

Finally there’s the fun of trying to change a huge existing organization with lots of inertia. “Leading Change” by John Kotter has some good insights and frameworks to approach this challenge.

Despite what it might look like I’m not quitting kernel hacking nor the X.org community, and I’m happy to discuss my talk over mail and in upcoming hallway tracks.

February 01, 2018

Frequently it is desirable to compare two ELF files. As someone who makes changes to the link-editor, comparing large numbers of built objects is a vital part of verifying any changes. In addition, determining which objects have changed from one build to another can reduce object distribution to only those objects that have changed. Often, it is simply enlightening to know "what did I change in this ELF file to make it different?".

Various tools exist to compare ELF files, often being scripts that call upon tools like elfdump(1), dis(1), and od(1), to analyze sections in more detail. These tools can be rather slow and produce voluminous output.

ELF files have an inherent problem when trying to analyze differences — even a small change to a section within an object, i.e. code changes to .text, .data or .rodata, can result in offset changes that ripple through the ELF file, affecting many sections and the data these sections contain. Trying to extrapolate the underlying cause of a difference between two ELF files, amongst all the differences that exist, can be overwhelming.

elfdiff(1) attempts to analyze two ELF files and diagnose the most important changes. Typically, the most significant changes to an object can be conveyed by changes to the symbol table. Functions and data items get added or deleted, or change size. Most of the time this can be sufficient to know/confirm what has changed.

After providing any symbol information, elfdiff continues to grovel down into individual sections and indicates what might have changed. The output style of the diagnostics is a mix of dis(1) for function diffs, od(1) for data diffs, and elfdump(1) style for sections that elfdump understands and provides high-level formatted displays for.

The output is limited. A handful of symbols are displayed first. Sections report a single line of difference, or a single line of difference for each symbol already diagnosed. The style of each difference, and the order in which differences are displayed, is covered in the elfdiff(1) man page.

This is an overview diff, appropriate for answering questions such as "What are the high level differences between two nightly builds". It does not replace lower level tools such as elfdump(1), but rather, provides a higher level analysis that might then be used to guide the use of lower level tools.

Some files may contain sections that always change from one build to another, things like comment or signature sections. These can be ignored with the -i option. Sometimes only one or two sections are of interest. These can be specified with the -s option. If you really want to see all the differences between two files, use the -u option. But be careful, the output can be excessive.
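For instance (a hypothetical invocation of my own, not lifted from the man page; the section names are illustrative), one might skip the comment section while narrowing the comparison to the text section:

   $ elfdiff -i .comment -s .text foo.so.1 foo.so.2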

The following provides an example of comparing two versions of a shared object, and is lifted directly from the elfdiff(1) man page.

$ elfdiff -e foo.so.1 foo.so.2
*** symbols: differ
< [9287] 0x935c0 0x1bd FUNC GLOB D 0 .text device_offline
> [9287] 0x935c0 0x1f5 FUNC GLOB D 0 .text device_offline
---
< [10233] 0x111240 0x20 FUNC GLOB D 0 .text new_device_A
< [10010] 0x111260 0x64 FUNC GLOB D 0 .text new_device_B
---
> [15317] 0 0 NOTY GLOB D 0 UNDEF __assfailline__13
*** section: [1].SUNW_cap: shdr information differs
< sh_size: 0xe0 sh_type: [ SHT_SUNW_cap ]
> sh_size: 0x120 sh_type: [ SHT_SUNW_cap ]
*** section: [1].SUNW_cap: data information differs
< 0x80: [8] CA_SUNW_ID 0x2317 i86pc-clmul
> 0x80: [8] CA_SUNW_ID 0x1f59 i86pc-avx2
*** section: [6].text: shdr information differs
< sh_size: 0x38e205 sh_type: [ SHT_PROGBITS ]
> sh_size: 0x38e245 sh_type: [ SHT_PROGBITS ]
*** section: [6].text: data information differs
---
<sym>: device_offline()
< 0x935d9:<sym>+0x19: 48 8b df movq %rdi,%rbx
> 0x935d9:<sym>+0x19: 4c 8b e7 movq %rdi,%r12
---
< <sym>: new_device_A
< 0x111240:<sym>: 55 push %rbp
---
< <sym>: new_device_B
< 0x111260:<sym>: 55 push %rbp
*** section: [9].strtab: shdr information differs
< sh_size: 0x642c5 sh_type: [ SHT_STRTAB ]
> sh_size: 0x642d9 sh_type: [ SHT_STRTAB ]
*** section: [9].strtab: data information differs
< 0x42297: n e _ _ 1 3 8 5 \0 _ _ ...
> 0x42297: n e _ _ 1 3 8 4 \0 _ _ ...
*** section: [13].rela.text: shdr information differs
< sh_size: 0x36d398 sh_type: [ SHT_RELA ]
> sh_size: 0x36d428 sh_type: [ SHT_RELA ]
*** section: [13].rela.text: data information differs
< 0x0: [0] R_AMD64_32S 0x1635f4 0x1638b4 .text
> 0x0: [0] R_AMD64_32S 0x163634 0x1638f4 .text
*** section: [33].SUNW_ctf: shdr information differs
< sh_size: 0xb33c sh_type: [ SHT_PROGBITS ]
> sh_size: 0xb4e4 sh_type: [ SHT_PROGBITS ]
*** section: [33].SUNW_ctf: data information differs
< 0xd: \0 \0 \0 \08 \0 \0 \0 \b2 \03 \0 \0 D ...
> 0xd: \0 \0 \0 \08 \0 \0 \0 \ba \03 \0 \0 $ ...
*** section: [34].SUNW_signature: data information differs
< 0x73: j o h n d o e \t \93 \ab \ff \fa ...
> 0x73: j o h n d o e \c2 \c5 \98 r a ...

After the release of the Oracle Solaris 11.4 Beta and the post on the new observability features by James McPherson I've had a few folks ask me if it's possible to export the data from the StatsStore into a format like CSV (Comma-separated values) so they can easily import this into something like Excel.

The answer is: Yes

The main command to access the StatsStore through the CLI is sstore(1), which you can use either as a single command or as an interactive shell-like environment, for example to browse the statistics namespace. The other way to access the StatsStore is through the Oracle Solaris Dashboard in a browser, where you point to the system's IP address on port 6787. A third way to access the data is through the REST interface (which the Dashboard is actually also using to get its data), but this is something for a later post.

As James pointed out in his post you can use sstore(1) to list the currently available resources, and you can use export to pull data from one or more of those resources. And it's with this last option that you can specify the format you want this data exported in. The default is tab-separated:

demo@solaris-vbox:~$ sstore export -t 2018-02-01T06:47:00 -e 2018-02-01T06:52:00 -i 60 '//:class.cpu//:stat.usage'
TIME                VALUE              IDENTIFIER
2018-02-01T06:47:00 20286401.157722    //:class.cpu//:stat.usage
2018-02-01T06:48:00 20345863.706499    //:class.cpu//:stat.usage
2018-02-01T06:49:00 20405363.144286    //:class.cpu//:stat.usage
2018-02-01T06:50:00 20465694.085729    //:class.cpu//:stat.usage
2018-02-01T06:51:00 20525877.600447    //:class.cpu//:stat.usage
2018-02-01T06:52:00 20585941.862812    //:class.cpu//:stat.usage

But you can also get it in CSV:

demo@solaris-vbox:~$ sstore export -F csv -t 2018-02-01T06:47:00 -e 2018-02-01T06:52:00 -i 60 '//:class.cpu//:stat.usage'
time,//:class.cpu//:stat.usage
1517496420000000,20286401.157722
1517496480000000,20345863.706499
1517496540000000,20405363.144286
1517496600000000,20465694.085729
1517496660000000,20525877.600447
1517496720000000,20585941.862812

And in JSON:

demo@solaris-vbox:~$ sstore export -F json -t 2018-02-01T06:47:00 -e 2018-02-01T06:52:00 -i 60 '//:class.cpu//:stat.usage'
{
    "__version": 1,
    "data": [
        {
            "ssid": "//:class.cpu//:stat.usage",
            "records": [
                { "start-time": 1517496420000000, "value": 20286401.157722 },
                { "start-time": 1517496480000000, "value": 20345863.706498999 },
                { "start-time": 1517496540000000, "value": 20405363.144285999 },
                { "start-time": 1517496600000000, "value": 20465694.085728999 },
                { "start-time": 1517496660000000, "value": 20525877.600446999 },
                { "start-time": 1517496720000000, "value": 20585941.862812001 }
            ]
        }
    ]
}

Each of these has its own manual entry: sstore.csv(5) and sstore.json(5).

Now the question arises: how do you get something interesting/useful? Well, part of this is about learning what the StatsStore can gather for you and the types of tricks you can do with the data before you export it. This is where the Dashboard is a great learning guide. When you first log in you get a landing page very similar to this:

Note: The default install of Oracle Solaris won't have a valid cert and the browser will complain it's an untrusted connection. Because you know the system you can add an exception and connect.

Because this post is not about exploring the Dashboard but about exporting data I'll just focus on that. But by all means click around.

So if you click on the "CPU Utilization by mode (%)" graph, you're essentially double-clicking on that data and you'll go to a statistics sheet we've built showing all kinds of aspects of CPU utilization, which should look something like this:

Note: You can see my VirtualBox instance is pretty busy.

So these graphs look pretty interesting, but how do I get to this data? Well, if we're interested in the Top Processes, first click on Top Processes by CPU Utilization and this should bring up this overlay window:

Note: This shows this statistic is only temporarily collected (something you could make persistent here) and that the performance impact of collecting this statistic is very low.

Now click on "proc cpu-percentage" and this will show what is being collected to create this graph:

This shows the SSID of the data in this graph. A quick look at this shows it's looking in the process data //:class.proc, then it's using a wildcard on the resources //:res.* which grabs all the entries available, then it selects the statistic for CPU usage in percent //:stat.cpu-percentage, and finally it does a top operation on this list and selects the top 5 processes //:op.top(5) (see ssid-op(7) for more info). And when I use this on the command line I get:

demo@solaris-vbox:~$ sstore export -F CSV -t 2018-02-01T06:47:00 -i 60 '//:class.proc//:res.*//:stat.cpu-percentage//:op.top(5)'
time,//:class.proc//:res.firefox/2035/demo//:stat.cpu-percentage//:op.top(5),//:class.proc//:res.rad/204/root//:stat.cpu-percentage//:op.top(5),//:class.proc//:res.gnome-shell/1316/demo//:stat.cpu-percentage//:op.top(5),//:class.proc//:res.Xorg/1039/root//:stat.cpu-percentage//:op.top(5),//:class.proc//:res.firefox/2030/demo//:stat.cpu-percentage//:op.top(5)
1517496480000000,31.378174,14.608765,1.272583,0.500488,0.778198
1517496540000000,33.743286,8.999634,3.271484,1.477051,2.059937
1517496600000000,41.018677,9.545898,5.603027,3.170776,3.070068
1517496660000000,37.011719,8.312988,1.940918,0.958252,1.275635
1517496720000000,29.541016,8.514404,9.561157,4.693604,0.869751

Here "-F CSV" tells it to output CSV (I could also have used lowercase csv), "-t 2018-02-01T06:47:00" is the begin time of what I want to look at (I'm not using an end time, which would be similar but with "-e"), "-i 60" tells it I want the length of each sample to be 60 seconds, and then I use the SSID from above.

Note: For the CSV export to work you'll need to specify at least the begin time (-t) and the length of each sample (-i), otherwise the export will fail. You also want to export data the StatsStore has actually gathered, or it will also not work.

In the response the first line is the header describing what each column is (time, firefox, rad, gnome-shell, Xorg, firefox), followed by the values, where the first column is UNIX time.
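The timestamps in that first column are microseconds since the epoch. As a small sketch of my own (assuming GNU awk for strftime), you could post-process the CSV into human-readable times like this:

   $ sstore export -F csv -t 2018-02-01T06:47:00 -i 60 '//:class.cpu//:stat.usage' | \
         awk -F, -v OFS=, 'NR==1 { print; next }
                           { $1 = strftime("%Y-%m-%dT%H:%M:%S", $1/1000000); print }'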

Similarly if I look at what data is driving the CPU Utilization graph I get the following data with this SSID:

demo@solaris-vbox:~$ sstore export -F csv -t 2018-02-01T06:47:00 -i 60 '//:class.cpu//:stat.usage//:part.mode(user,kernel,stolen,intr)//:op.rate//:op.util'
time,//:class.cpu//:stat.usage//:part.mode(user,kernel,stolen,intr)//:op.rate//:op.util(intr),//:class.cpu//:stat.usage//:part.mode(user,kernel,stolen,intr)//:op.rate//:op.util(kernel),//:class.cpu//:stat.usage//:part.mode(user,kernel,stolen,intr)//:op.rate//:op.util(user)
1517496420000000,2.184663,28.283780,31.322588
1517496480000000,2.254090,16.524862,32.667445
1517496540000000,1.568696,19.479255,41.112911
1517496600000000,1.906700,18.194955,39.069998
1517496660000000,2.326821,18.103397,39.564789
1517496720000000,2.484758,17.909993,38.684371

Note: Even though we've asked for data on user, kernel, stolen, and intr (interrupts), it doesn't return data on stolen as it doesn't have this.

Also Note: It's using two other operations rate and util in combination to create this result (also see ssid-op(7) for more info).

This should allow you to click around the Dashboard and learn what you can gather and how to export it. We'll talk more about mining interesting data, and for example using the JSON output, in later posts.

Khronos has recently announced the conformance program for OpenGL 4.6 and I am very happy to say that Intel has submitted successful conformance applications for several of its GPU models for the Mesa Linux driver. For specifics on the conformant hardware you can check the list of conformant OpenGL products at the Khronos website.

Being conformant on day one, which the Intel Mesa Vulkan driver also obtained back in the day, is a significant achievement. Besides Intel Mesa, only NVIDIA managed to do this, which I think speaks of the amount of work and effort that one needs to put to achieve it. The days where Linux implementations lagged behind are long gone, we should all celebrate this and acknowledge the important efforts that companies like Intel have put into making this a reality.

Over the last 8-9 months or so, I have been working together with some of my Igalian friends to keep the Intel drivers (for both OpenGL and Vulkan) conformant, so I am very proud that we have reached this milestone. Kudos to all my work mates who have worked with me on this, to our friends at Intel, who have been providing reviews for our patches, feedback and additional driver fixes, and to many other members in the Mesa community who have contributed to make this possible in one way or another.

Of course, OpenGL 4.6 conformance requires that we have an implementation of GL_ARB_gl_spirv, which allows OpenGL applications to consume SPIR-V shaders. If you have been following Igalia’s work, you have already seen some of my colleagues sending patches for this over the last months, but the feature is not completely upstreamed yet. We are working hard on this, but the scope of the implementation that we want for upstream is rather ambitious, since it involves (finally) having a full shader linker in NIR. Getting that to be as complete as the current GLSL linker and in a shape that is good enough for review and upstreaming is going to take some time, but it is surely a worthwhile effort that will pay off in the future, so please look forward to it and be patient with us as we upstream more of it in the coming months.

It is also important to remark that OpenGL 4.6 conformance doesn’t just validate new features in OpenGL 4.6; it is a full conformance program for OpenGL drivers that includes OpenGL 4.6 functionality, and as such, it is a superset of the OpenGL 4.5 conformance. The OpenGL 4.6 CTS does, in fact, incorporate a whole lot of bugfixes and expanded coverage for OpenGL features that were already present in OpenGL 4.5 and prior.

What is the conformance process and why is it important?

It is a well-known issue with standards that different implementations are not always consistent. This can happen for a number of reasons. For example, implementations have bugs which can make something work on one platform but not on another (which then requires applications to implement workarounds). Another reason is that sometimes implementors just have different interpretations of the standard.

The Khronos conformance program is intended to ensure that products that implement Khronos standards (such as OpenGL or Vulkan drivers) do what they are supposed to do and they do it consistently across implementations from the same or different vendors. This is achieved by producing an extensive test suite, the Conformance Test Suite (or CTS for short), which aims to verify that the semantics of the standard are properly implemented by as many vendors as possible.

Why is CTS different to other test suites available?

One reason is that CTS development includes Khronos members who are involved in the definition of the API specifications. This means there are people well versed in the spec language who can review and provide feedback to test developers to ensure that the tests are good.

Another reason is that before new tests go in, it is required that there are at least a number of implementations (from different vendors) that pass them. This means that various vendors have implemented the related specifications and these different implementations agree on the expected result, which is usually a good sign that the tests and the implementations are good (although this is not always enough!).

How do the CTS and the Khronos conformance process help API implementors and users?

First, it makes it so that existing and new functionality covered in the API specifications is tested before granting the conformance status. This means that implementations have to run all these tests and pass them, producing the same results as other implementations, so as far as the test coverage goes, the implementations are correct and consistent, which is the whole point of this process: it won’t matter if you’re running your application on Intel, NVIDIA, AMD or a different GPU vendor; if your application is correct, it should run the same no matter which driver you are running on.

Now, this doesn’t mean that your application will run smoothly on all conformant platforms out of the box. Application developers still need to be aware that certain aspects or features in the specifications are optional, or that different hardware implementations may have different limits for certain things. Writing software that can run on multiple platforms is always a challenge and some of that will always need to be addressed on the application side, but at least the conformance process attempts to ensure that for applications that do their part of the work, things will work as intended.

There is another interesting implication of conformance that has to do with correct API specification. Designing APIs that can work across hardware from different vendors is a challenging process. With the CTS, Khronos has an opportunity to validate the specifications against actual implementations. In other words, the CTS allows Khronos to verify that vendors can implement the specifications as intended and revisit the specification if they can’t before releasing them. This ensures that API specifications are reasonable and a good match for existing hardware implementations.

Another benefit of CTS is that vendors working on any API implementation will always need some level of testing to verify their code during development. Without CTS, they would have to write their own tests (which would be biased towards their own interpretations of the spec anyway), but with CTS, they can leave that to Khronos and focus on the implementation instead, cutting down development times and sharing testing code with other vendors.

What about Piglit or other testing frameworks?

CTS doesn’t make Piglit obsolete or redundant at all. While CTS coverage will improve over the years it is nearly impossible to have 100% coverage, so having other testing frameworks around that can provide extra coverage is always good.

My experience working on the Mesa drivers is that it is surprisingly easy to break stuff, especially on OpenGL, which has a lot of legacy stuff in it. I have seen way too many instances of patches that seemed correct and in fact fixed actual problems, only to later see Piglit, CTS and/or dEQP report regressions on existing tests. The (not so surprising) thing is that many times the regressions were reported on just one of these testing frameworks, which means they all provide some level of coverage that is not included in the others.

It is for this reason that the continuous integration system for Mesa provided by Intel runs all of these testing frameworks (and then some others). You just never get enough testing. And even then, some regressions slip into releases despite all the testing!

Also, in the case of Piglit in particular, I have to say that it is very easy to write new tests, especially shader runner tests, which is always a bonus. Writing tests for CTS or dEQP, for example, requires more work in general.

So what have I been doing exactly?

For the last 9 months or so, I have been working on ensuring that the Intel Mesa drivers for both Vulkan and OpenGL are conformant. If you have followed any of my work in Mesa over the last year or so, you have probably already guessed this, since most of the patches I have been sending to Mesa reference the conformance tests they fix.

To be more thorough, my work included:

  1. Reviewing and testing patches submitted for inclusion in CTS that either fixed test bugs, extended coverage for existing features or added new tests for new API specifications. CTS is a fairly active project with numerous changes submitted for review pretty much every day, for OpenGL, OpenGL ES and Vulkan, so staying on top of things requires a significant dedication.

  2. Ensuring that the Intel Mesa drivers passed all the CTS tests for both Vulkan and OpenGL. This requires running the conformance tests, identifying test failures, identifying the cause of the failures and providing proper fixes. The fixes would go to CTS when the cause of the issue was a bogus test; to the driver, when it was a bug in our implementation or the fact that the driver was simply missing some functionality; or they could even go to the OpenGL or Vulkan specs, when the source of the problem was incomplete, ambiguous or incorrect spec language that was used to drive the test development. I have found instances of all these situations.

Where can I get the CTS code?

Good news, it is open source and available at GitHub.

This is a very important and welcomed change by Khronos. When I started helping Intel with OpenGL conformance, specifically for OpenGL 4.5, the CTS code was only available to specific Khronos members. Since then, Khronos has done a significant effort in working towards having an open source testing framework where anyone can contribute, so kudos to Khronos for doing this!

Going open source not only enables broader collaboration and further development of the CTS. It also puts in the hands of API users a handful of small test samples that people can use to learn how some of the new Vulkan and OpenGL APIs released to the public are to be used, which is always nice.

What is next?

As I said above, CTS development is always ongoing: there is always testing coverage to expand for existing features, bugfixes to provide for existing tests, more tests that need to be adapted or changed to match corrections in the spec language, new extensions and APIs that need testing coverage, etc.

And on the driver side, there are always new features to implement that come with their potential bugs that need to be fixed, occasional regressions that need to be addressed promptly, new bugs uncovered by new tests that need fixes, etc

So the fun never really stops 🙂

Final words

In this post, besides bringing the good news to everyone, I hope that I have made a good case for why the Khronos CTS is important for the industry and why we should care about it. I hope that I also managed to give a sense for the enormous amount of work that goes into making all of this possible, both on the side of Khronos and the side of the driver developer teams. I think all this effort means better drivers for everyone and I hope that we all, as users, come to appreciate it for that.

Finally, big thanks to Intel for sponsoring our work on Mesa and CTS, and also to Igalia, for having me work on this wonderful project.

OpenGL® and the oval logo are trademarks or registered trademarks of Silicon Graphics, Inc. in the United States and/or other countries worldwide. Additional license details are available on the SGI website.

If you look closely at the listings for the Oracle Solaris 11.4 Reference Manuals and the previous Oracle Solaris 11.3 Reference Manuals, you might notice a change in some sections.  One of our “modernization” projects for this release actually took us back to our roots, returning to the man page section numbers used in SunOS releases before the adoption of the System V scheme in Solaris 2.0.  When I proposed this change, I dug into the history a bit to explain it in the PSARC case that reviewed the switchover.

Unix man pages have been divided into numbered sections for their entire recorded history. The original sections, as seen in the Introduction to the Unix 1st Edition Manual from 1971 & the Unix 2nd Edition Manual from 1972, were:

I. Commands
II. System calls
III. Subroutines
IV. Special files
V. File formats
VI. User-maintained programs
VII. Miscellaneous

By Version 7, Bell Labs had switched from Roman numerals to Arabic and updated the definitions a bit:

1. Commands
2. System calls
3. Subroutines
4. Special files
5. File formats and conventions
6. Games
7. Macro packages and language conventions
8. Maintenance

Most Unix derivatives followed this section breakdown, and a very similar set is still used today on BSD, Linux, and MacOS X:

1 General commands
2 System calls
3 Library functions, covering in particular the C standard library
4 Special files (usually devices, those found in /dev) and drivers
5 File formats and conventions
6 Games and screensavers
7 Miscellanea
8 System administration commands and daemons

The Linux Filesystem Hierarchy Standard defines these sections as

man1: User programs
Manual pages that describe publicly accessible commands are contained in this chapter. Most program documentation that a user will need to use is located here.
man2: System calls
This section describes all of the system calls (requests for the kernel to perform operations).
man3: Library functions and subroutines
Section 3 describes program library routines that are not direct calls to kernel services. This and chapter 2 are only really of interest to programmers.
man4: Special files
Section 4 describes the special files, related driver functions, and networking support available in the system. Typically, this includes the device files found in /dev and the kernel interface to networking protocol support.
man5: File formats
The formats for many data files are documented in the section 5. This includes various include files, program output files, and system files.
man6: Games
This chapter documents games, demos, and generally trivial programs. Different people have various notions about how essential this is.
man7: Miscellaneous
Manual pages that are difficult to classify are designated as being section 7. The troff and other text processing macro packages are found here.
man8: System administration
Programs used by system administrators for system operation and maintenance are documented here. Some of these programs are also occasionally useful for normal users.
The Linux man pages also include a non-FHS specified section 9 for "kernel routine documentation."

But of course, one Unix system broke ranks and shuffled the numbering around. USL redefined the man page sections in System V to instead be:

1 General commands
1M System administration commands and daemons
2 System calls
3 C library functions
4 File formats and conventions
5 Miscellanea
7 Special files (usually devices, those found in /dev) and drivers

Most notably moving section 8 to 1M and swapping 4, 5, & 7 around.

Solaris still tried to follow the System V arrangement until now, with some extensions:

1 User Commands
1M System Administration Commands
2 System Calls
3 Library Interfaces and Headers
4 File Formats
5 Standards, Environments, and Macros
6 Games and screensavers
7 Device and Network Interfaces
9 DDI and DKI Interfaces

With Solaris 11.4, we've now given up the ghost of System V and declared Solaris to be back in sync with Bell Labs, BSD, and Linux numbering. Specifically, all existing Solaris man pages using these System V sections were renumbered to the listed standard section:

SysV    Standard
----    --------
1m  ->  8
4   ->  5
5   ->  7
7   ->  4

Sections 1, 2, 3, 6, and 9 remain as is, including the Solaris method of subdividing section 3 into per library subdirectories. The subdivisions of section 7 introduced in PSARC/1994/335 have become subdivisions of section 4 instead, for instance ioctls will now be documented in section 4I instead of 7I.

The man command was updated so that if someone specifies one of the remapped sections, it will look first in the section specified, then in any subsections of that section, then the mapped section, and then in any subsections of that section. This will assist users following references from older Solaris documentation to find the expected pages, as well as users of other platforms who don't know our subsections.

For example:

  • If a user did "man -s 4 core", looking for the man page that was delivered as /usr/share/man/man4/core.4, and no such page was found in /usr/share/man/man4/ it would look for /usr/share/man/man5/core.5 instead.
  • If a user did "man -s 3 malloc", it would display /usr/share/man/man3/malloc.3c.
  • The man page previously delivered as ip(7P), and now as ip(4P) could be found by any of:
    • man ip
    • man ip.4p
    • man ip.4
    • man ip.7p
    • man ip.7
    and the equivalent man -s formulations.

Additionally, as long as we were mucking with the sections, we defined two new sections which we plan to start using soon:

2D DTrace Providers
8S SMF Services

The resulting Solaris manual sections are thus now:

1 User Commands
2 System Calls
2D DTrace Providers
3 Library Interfaces and Headers
3* Interfaces split out by library (i.e. 3C for libc, 3M for libm, 3PAM for libpam)
4 Device and Network Interfaces
4D Device Drivers & /dev files
4FS FileSystems
4I ioctls for a class of drivers or subsystems
4M Streams Modules
4P Network Protocols
5 File Formats
6 Games and screensavers
7 Standards, Environments, Macros, Character Sets, and miscellany
8 System Administration Commands
8S SMF Services
9 DDI and DKI Interfaces
9E Driver Entry Points
9F Kernel Functions
9P Driver Properties
9S Kernel & Driver Data Structures

We hope this makes it easier for users and system administrators who have to use multiple OS'es by getting rid of one set of needless differences. It certainly helps us in delivering FOSS packages by not having to change all the manpages in the upstream sources to be different for Solaris just because USL wanted to be different 30 years ago.

Oracle Solaris 11.4 Beta: Download Location & Documentation

Recently Solaris 11.4 hit the web as a public beta product, meaning anyone can download and use it in non-production environments. This is the first major Solaris milestone since the release of Solaris 11.3 GA back in 2015.

A few interesting pages:


Logical Domains
Dynamic Reconfiguration
Blacklisted Resources
Command History

Dynamic Reconfiguration of Named Resources

Starting with the release of Oracle VM Server for SPARC 3.5 (aka LDoms) it is possible to dynamically reconfigure domains that have named resources assigned. Named resources are the resources that are assigned explicitly to domains. Assigning core ids 10 & 11 and a 32 GB block of memory at physical address 0x50000000 to some domain X is an example of named resource assignment. SuperCluster Engineered System is one example where named resources are explicitly assigned to guest domains.

Be aware that depending on the state of the system, domains and resources, some of the dynamic reconfiguration operations may or may not succeed.

Here are a few examples that show DR functionality with named resources.

ldm remove-core cid=66,67,72,73 primary
ldm add-core cid=66,67 guest1
ldm add-mem mblock=17664M:16G,34048M:16G,50432M:16G guest2

Listing Blacklisted Resources

When FMA detects faulty resource(s), Logical Domains Manager attempts to stop using those faulty core and memory resources (no I/O resources at the moment) in all running domains. Also those faulty resources will be preemptively blacklisted so they don't get assigned to any domain.

However, if the faulty resource is currently in use, Logical Domains Manager attempts to use core or memory DR to evacuate the resource. If the attempt fails, the faulty resource is marked as "evacuation pending". All such pending faulty resources are removed and moved to the blacklist when the affected guest domain is stopped or rebooted.

Starting with the release of LDoms software 3.5, blacklisted and evacuation pending resources (faulty resources) can be examined with the help of ldm's -B option.

eg.,

# ldm list-devices -B
CORE
    ID      STATUS          DOMAIN
    1       Blacklisted
    2       Evac_pending    ldg1
MEMORY
    PA              SIZE    STATUS          DOMAIN
    0xa30000000     87G     Blacklisted
    0x80000000000   128G    Evac_pending    ldg1

Check this page for some more information.

LDoms Command History

Recent releases of LDoms Manager can show the history of recently executed ldm commands with the list-history subcommand.

# ldm history
Jan 31 19:01:18  ldm ls -o domain -p
Jan 31 19:01:48  ldm list -p
Jan 31 19:01:49  ldm list -e primary
Jan 31 19:01:54  ldm history
..

The last 10 ldm commands are shown by default. The ldm set-logctl history=<value> command can be used to configure the number of commands kept in the command history. Setting the value to 0 disables the command history log.
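For example (a sketch based on the syntax above), to keep the last 100 commands, or to turn the history log off entirely:

   # ldm set-logctl history=100
   # ldm set-logctl history=0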


Disks
Determine the Blocksize

The devprop command on recent versions of Solaris 11 can show the logical and physical block size of a device. The size is reported in bytes.

eg.,

The following output shows a 512-byte size for both the logical and the physical block. It is likely a 512-byte native disk (512n).

% devprop -v -n /dev/rdsk/c4t2d0 device-blksize device-pblksize
device-blksize=512
device-pblksize=512

Find some useful information about disk drives that exceed the common 512-byte block size here.


Security Services
Privileges

With the debugging option enabled, the ppriv command on recent versions of Solaris 11 can be used to check whether the current user has the privileges required to run a certain command.

eg.,

% ppriv -ef +D /usr/sbin/trapstat
trapstat[18998]: missing privilege "file_dac_read" (euid = 100, syscall = "faccessat") for "/devices/pseudo/trapstat@0:trapstat" at devfs_access+0x74
trapstat: permission denied opening /dev/trapstat: Permission denied

% ppriv -ef +D /usr/sbin/prtdiag
System Configuration:  Oracle Corporation  sun4v T5240
Memory size: 65312 Megabytes
================================ Virtual CPUs ================================
..

The following example examines the privileges of a running process.

# ppriv 23829
January 31, 2018
History of the Immutable (ROZR) Zones 

In Solaris 11 11/11 we introduced Immutable non-global zones; these have been built on top of MWAC (Mandatory Write Access Control) using a handful of choices for the file-mac-profile property in zone configurations. Management was only possible by booting the zone read/write or by modifying configuration files from within the global zone.

In Solaris 11.2 we added support for the Immutable Global Zone and so we also added the Immutable Kernel Zone. In order to make maintenance possible for the global zone we added the concept of a Trusted Path login. It is invoked through the abort-sequence for an LDOM or bare metal system and for native and kernel zones using the -T/-U options for zlogin(1).

Limitations

The Trusted Path introduced in Solaris 11.2 was not available to services, and changes to the SMF repository were always possible. Depending on the file-mac-profile, /etc/svc/repository.db was either writable (not MWAC protected, such as in flexible-configuration), in which case the changes were permanent and the immutable zone's configuration was not protected, or it was not writable, in which case a writable copy was created in /system/volatile and changes would not persist across a reboot. In order to make any permanent changes the system needed to be rebooted read/write.

The administrator had two choices: either the changes to the SMF repository were persistent (file-mac-profile=flexible-configuration) or any permanent changes required a r/w boot. In all cases, the behavior of an immutable system could be modified considerably.

When an SMF service was moved into the Trusted Path using ppriv(1), it could not be killed and the service would go to maintenance on the first attempt to restart or stop it.

In Solaris 11.4 we updated the Immutable Zone: SMF becomes immutable and we introduce services on the Trusted Path. Persistent SMF changes can be made only when they are made from the Trusted Path. 

SMF becomes Immutable

SMF has two different repositories: the persistent repository, which contains all of the system and service configuration, and the non-persistent repository, which contains the current state of the system, i.e. which services are actually running. The latter also stores the non-persistent property groups such as general_ovr; this property group is used to record whether services are enabled or disabled.

The svc.configd service now runs in the Trusted Path, so it can change the persistent repository regardless of the MWAC profile. Changes made to the persistent repository will now always survive a reboot.

The svc.configd checks whether the caller is running in the Trusted Path; if a process runs in the Trusted Path it is allowed to make changes to the persistent repository. If not, an error is returned.

Trusted Path services

In Solaris 11.4 we introduce a Boolean parameter in the SMF method_credential called "trusted_path"; if it is set to true, the method runs in the Trusted Path. This feature is joined at the hip with Immutable SMF: without the latter, it would be easy to escalate from a privileged process to a privileged process in the Trusted Path.

To allow these processes to behave normally, we added a new privilege flag, PRIV_TPD_KILLABLE; such a process, even when run in the Trusted Path, can be sent a signal from outside the Trusted Path. But clearly such a process cannot be manipulated from outside the Trusted Path, so you can't attach a debugger to it unless the debugger runs in the Trusted Path too.

As the Trusted Path property can only be granted by, or inherited from, init(8), the SMF restarters need to run in the Trusted Path.

This feature allows us to run self-assembly services that do not depend on the self-assembly-complete milestone; instead we can now run them on the Trusted Path. These services can take as long as they want, and they can be run on each and every boot and even when the service is restarted.

If a system administrator wants the console login to always run on the Trusted Path, that can easily be achieved by running the following commands:

# svccfg -s console-login:default setprop trusted_path = boolean: true

# svccfg -s console-login:default refresh

 etc

It is possible in Oracle Solaris 11.4 to write a service which updates and reboots the system; such a service can be started by an administrator outside of the Trusted Path by temporarily enabling it. Combined with non-reboot immutable, which was introduced in Oracle Solaris 11.3 SRU 12, automatic and secure updates are now possible without additional downtime. Similarly there may be use cases for deploying a configuration management service, such as Puppet or Ansible, on the SMF Trusted Path so that it can reconfigure the system while interactive administrators, even root, can not.
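As a sketch (the service name here is hypothetical, not one shipped with Solaris), temporarily enabling such an update service from outside the Trusted Path could look like this; the -t flag makes the enable non-persistent, so it does not survive a reboot:

   # svcadm enable -t svc:/site/system-update:default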

Those of you who have ever read my own blog (Ghost Busting) will know I have a long-standing interest in trying to make the systems we all use and love easier to fix, and ideally have them tell you themselves what's wrong with them.

Back in Oracle Solaris 11 we added the concept of Software Fault Management Architecture (SWFMA), with two types of event modelled as FMA defects. One was panic events, the other SMF service state transitions. This also allowed for notification of all FMA events via a new service facility (svccfg setnotify) over SNMP and email.

With a brief diversion to make crash dumps smaller, faster, and easier to diagnose, we've come back to the SWFMA concept and extended it in two ways.

Corediag

It's pretty easy to see that the same concept of modelling a system panic as an FMA event could be applied to user-level process core dumps. So that's what we've done. coreadm(8) has been extended so that by default a diagnostic core file is created for every Solaris binary that crashes. This is smaller than a regular core dump. We then have a service (svc:/system/coremon:default) which runs a daemon (coremond) that monitors for these being created and summarizes them. By default the summary file is kept, though you can use coreadm to remove them. coremond then turns these into FMA alerts. These are considered more informational than full-on defects, but are still present in the FMA logs. You can run fmadm(8) as in the screenshot below to see any alerts. This was one I got when debugging a problem with a core file.
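As a quick sketch (my own, not the exact commands from the screenshot), you can verify that the core monitoring service is online and then ask FMA what it has diagnosed; fmadm faulty is the usual way to list diagnosed problems, and the exact subcommand for listing informational alerts may differ on your release:

   # svcs svc:/system/coremon:default
   # fmadm faulty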

  Stackdiag

Over the years we've learned that for known problems there is a very good correlation between the raw stack trace and an existing bug, and we've had tools internally to do this matching for years. We've mined our bug database and extracted stack information; this is bundled up into a file delivered by pkg:/system/diagnostic/stackdb. Any time FMA detects there is stack telemetry in an FMA event, and the FMA message indicates we could be looking at a software bug, it'll trigger a lookup and try to add the bug id for significant threads to the FMA message.

So if you look in the message above you'll see the description that says:

Description : A diagnostic core file was dumped in /var/diag/e83476f7-104d-4c85-9de4-bf7e45f261d1 for RESOURCE /usr/bin/pstack whose ASRU is . The ASRU is the Service FMRI for the resource and will be NULL if the resource is not part of a service. The following are potential bugs. stack[0] - 24522117

The stack[0] shows which thread within the process caused the core dump, and obviously the bug id following it is one you can search for in MOS to find any solution records. Alternatively you can see if you've already got the fix in your Oracle Solaris package repository by using pkg search to search for the bug id.
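A sketch of what that lookup might look like, using the bug id from the message above and assuming the bug id is indexed in the repository's package metadata (-r searches the configured remote repositories):

   $ pkg search -r 24522117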

Hang on. If I've got the fix in the package repository why isn't it installed?

The stackdb package is what we refer to as "unincorporated". It is just a data file, has no code within it, and the latest version you have available will be installed (not just the one matching the SRU you're on). So you can update it regularly to get the latest diagnosis, without updating the SRU on your system or rebooting it. This means you may get information about bugs which are already fixed and for which the fix is available, even when you are on older SRUs.

Software FMA

We believe that these are the first steps to a self-diagnosing system, and will reduce the need to log SRs for bugs we already know about. Hopefully this will mean you can get the fixes quicker, with minimal process or fuss.

 

Contributed by: Alexandr Nedvedicky

This blog entry covers the migration from IPF to Packet Filter (a.k.a. PF). If your Oracle Solaris 11.3 runs without IPF, then you can stop reading now (well, of course, if you're interested in reading about the built-in firewalls you should continue on). IPF served as the network firewall on Oracle Solaris for more than a decade (since Oracle Solaris 10). PF has been available on Oracle Solaris since Oracle Solaris 11.3 as an alternative firewall; administrators must install it explicitly using 'pkg install firewall'. Having both firewalls shipped during the Oracle Solaris 11.3 release cycle should provide some time to prepare for the leap to a world without IPF. If you as a sysadmin have completed your homework and your ipf.conf (et al.) are already converted to pf.conf, then skip ahead to 'What has changed since Oracle Solaris 11.3'.

IPF is gone, what now?

On upgrade from Oracle Solaris 11.3, PF will be automatically installed without any action from the administrator. This is implemented by renaming pkg:/network/ipfilter to pkg:/network/ipf2pf and adding a dependency on pkg:/network/firewall. The ipf2pf package installs the ipf2pf(7) service (svc:/network/ipf2pf), which runs at the first boot of the newly updated BE. The service inspects the IPF configuration, which is still available in '/etc/svc/repository-boot'; the ipf2pf start method uses that repository to locate the IPF configuration. The IPF configuration is then moved to the '/var/firewall/legacy.ipf/conf' directory. The content of the directory may vary depending on your ipfilter configuration. The next step is to attempt to convert the legacy IPF configuration to pf.conf. Unlike IPF, PF keeps its configuration in a single file named pf.conf (/etc/firewall/pf.conf). The service uses the 'ipf2pf' binary tool for the conversion. Please do not set your expectations for ipf2pf too high. It's a simple tool (like a hammer or screwdriver) to support your craftsman's skill while converting the implementation of your network policy from IPF to PF. The tool might work well for simple cases, but please always review the conversion result before deploying it. As soon as the ipf2pf service is done with the conversion, it updates /etc/firewall/pf.conf with a comment pointing you to the result: '/var/firewall/legacy.ipf/pf.conf'.
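After the first boot of the updated BE you can check (a small sketch of my own, using only the service and paths named above) that the conversion service has run and look at what it produced:

   # svcs svc:/network/ipf2pf
   # cat /var/firewall/legacy.ipf/pf.conf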

Let's see how the tool actually works when it converts your IPF configuration to PF. Let's assume your IPF configuration is kept in ipf.conf, ipnat.conf and ippool-1.conf:

ipf.conf:

   #
   # net0 faces to public network. we want to allow web and mail
   # traffic as stateless to avoid explosion of IPF state tables.
   # mail and web is busy.
   #
   # allow stateful inbound ssh from trusted hosts/networks only
   #
   block in on net0 from any to any
   pass in on net0 from any to 192.168.1.1 port = 80
   pass in on net0 from any to 172.16.1.15 port = 2525
   pass in on net0 from pool/1 to any port = 22 keep state
   pass out on net0 from any to any keep state
   pass out on net0 from 192.168.1.1 port = 80 to any
   pass out on net0 from 192.168.1.1 port = 2525 to any

ipnat.conf:

   # let our private lab network talk to network outside
   map net0 172.16.0.0/16 -> 0.0.0.0/32
   rdr net0 192.168.1.1/32 port 25 -> 172.16.1.15 port 2525

ippool-1.conf:

   table role = ipf type = tree number = 1 { 8.8.8.8, 10.0.0.0/32 };

In order to convert ipf configuration above, we run ipf2pf as follows:

   ipf2pf -4 ipf.conf -n ipnat.conf -p ippool-1.conf -o pf.conf

The resulting pf.conf looks like this:

   #
   # File was generated by ipf2pf(7) service during system upgrade. The
   # service attempted to convert your IPF rules to PF (the new firewall)
   # rules. You should check if firewall configuration here, suggested by
   # ipf2pf, still meets your network policy requirements.
   #

   #
   # Unlike IPF, PF intercepts packets on loopback by default.  IPF does not
   # intercept packets bound to loopback. To turn off the policy check for
   # loopback packets, we suggest to use command below:
   set skip on lo0

   #
   # PF does IP reassembly by default. It looks like your IPF does not have IP
   # reassembly enabled. Therefore the feature is turned off.
   #
   set reassemble no
   # In case you change your mind and decide to enable IP reassembly
   # delete the line above. Also to improve interoperability
   # with broken IP stacks, tell PF to ignore the 'DF' flag when
   # doing reassembly. Uncommenting line below will do it:
   #
   # set reassemble yes no-df

   #
   # PF tables are the equivalent of ippools in IPF. For every pool
   # in legacy IPF configuration, ipf2pf creates a table and
   # populates it with IP addresses from the legacy IPF pool. ipf2pf
   # creates persistent tables only.
   #
   table <pool_1> persist { 8.8.8.8, 10.0.0.0 }

   #
   # Unlike IPF, the PF firewall implements NAT as yet another
   # optional action of a regular policy rule. To keep PF
   # configuration close to the original IPF, consider using
   # the 'match' action in PF rules, which translate addresses.
   # There is one caveat with 'match'. You must always write a 'pass'
   # rule to match the translated packet. Packets are not translated
   # unless they hit a subsequent pass rule. Otherwise, the "match"
   # rule has no effect.
   #
   # It's also important to avoid applying nat rules to DHCP/BOOTP
   # requests. The following stateful rule, when added above the NAT
   # rules, will avoid that for us.
   pass out quick proto udp from 0.0.0.0/32 port 68 to 255.255.255.255/32 port 67

   # There are 2 such rules in your IPF ruleset
   #
   match out on net0 inet from 172.16.0.0/16 to any nat-to (net0)
   match in on net0 inet from any to 192.168.1.1 rdr-to 172.16.1.15 port 2525

   #
   # The pass rules below make sure rdr/nat -to actions
   # in the match rules above will take effect.
   pass out all
   pass in all

   block drop in on net0 inet all

   #
   # IPF rule specifies either a port match or return-rst action,
   # but does not specify a protocol (TCP or UDP). PF requires a port
   # rule to include a protocol match using the 'proto' keyword.
   # ipf2pf always assumes and enters a TCP port number
   #
   pass in on net0 inet proto tcp from any to 192.168.1.1 port = 80 no state

   #
   # IPF rule specifies either a port match or return-rst action,
   # but does not specify a protocol (TCP or UDP). PF requires a port
   # rule to include a protocol match using the 'proto' keyword.
   # ipf2pf always assumes and enters a TCP port number
   #
   pass in on net0 inet proto tcp from any to 172.16.1.15 port = 2525 no state

   #
   # IPF rule specifies either a port match or return-rst action,
   # but does not specify a protocol (TCP or UDP). PF requires a port
   # rule to include a protocol match using the 'proto' keyword.
   # ipf2pf always assumes and enters a TCP port number
   #
   pass in on net0 inet proto tcp from <pool_1> to any port = 22 flags any keep state (sloppy)

   pass out on net0 inet all flags any keep state (sloppy)

   #
   # IPF rule specifies either a port match or return-rst action,
   # but does not specify a protocol (TCP or UDP). PF requires a port
   # rule to include a protocol match using the 'proto' keyword.
   # ipf2pf always assumes and enters a TCP port number
   #
   pass out on net0 inet proto tcp from 192.168.1.1 port = 80 to any no state

   #
   # IPF rule specifies either a port match or return-rst action,
   # but does not specify a protocol (TCP or UDP). PF requires a port
   # rule to include a protocol match using the 'proto' keyword.
   # ipf2pf always assumes and enters a TCP port number
   #
   pass out on net0 inet proto tcp from 192.168.1.1 port = 2525 to any no state

As you can see, the resulting pf.conf file is annotated with comments, which explain what happened to the original ipf.conf.

What has changed since Oracle Solaris 11.3?

If you already have experience with PF on Solaris, then you will notice the following changes since Oracle Solaris 11.3:

  • firewall in degraded state: The firewall service enters the degraded state whenever it is enabled with the default configuration shipped in the package. This notifies the administrator that the system has enabled the firewall with an empty configuration. As soon as you alter /etc/firewall/pf.conf and refresh the firewall service, the service will come online (see the example after this list).

  • firewall in maintenance state: The firewall enters the maintenance state as soon as the service tries to load a syntactically invalid configuration. If that happens, the SMF method inserts hardwired fallback rules, which drop all inbound sessions except ssh.

  • support for IP interface groups: 11.4 comes with support for 'firewall interface groups'. The feature comes from upstream; the idea is best described by its author Henning Brauer [ goo.gl/eTjn54 ]. Oracle Solaris 11.4 brings the same feature with a Solaris flavor: interface groups are exposed as the interface property 'fwifgroup'. To assign interface net0 to group alpha, use ipadm(8):

       ipadm set-ifprop -p fwifgroup=alpha net0

    To show the firewall interface groups net0 is a member of, use show-ifprop:

       ipadm show-ifprop -p fwifgroup net0

    A firewall interface group is treated like any other interface a PF rule can be bound to. The rule below applies to all packets bound to firewall interface group alpha:

       pass on alpha all

    The firewall interface group is just a kind of tag you assign to an interface, so PF rules can refer to such interfaces using tags instead of names.
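For example, a minimal sketch of the degraded-state workflow mentioned above (svcadm and svcs accept the abbreviated service FMRI when it is unambiguous):

   # vi /etc/firewall/pf.conf
   # svcadm refresh firewall
   # svcs -x firewall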

Known Issues

Not everything went as planned. There are a bunch of changes that missed the beta-release build. These are known issues and will be addressed in the final release.

  • firewall:framework SMF instance: This is going to be removed in the final release. The feature presents yet another hurdle in the upgrade path, so it has been decided to postpone it. It is just unfortunate that we failed to remove it for the beta.

  • support for _auto/_static anchors: These are closely related to the firewall:framework instance. Support for those two anchors is postponed.

  • 'set skip on lo0' in default pf.conf: The 'set skip on lo0' line is going to disappear from the default pf.conf shipped by the firewall package, because unlike IPF, PF intercepts packets on loopback by default.

  • Lack of a firewall in Solaris 10 BrandZ: With IPF gone there is no firewall available in Solaris 10 BrandZ. The only solution for Solaris 10 deployments which require firewall protection is to move those Solaris 10 BrandZ behind a firewall which runs outside the Solaris 10 BrandZ. The firewall can either be another network appliance, or it can be a Solaris 11 zone with PF installed. The Solaris 11 zone will act as an L3 router, forwarding all traffic for the Solaris 10 zone. If such a solution does not work for your deployment or scenario, we hope to hear back from you with some details of your story.

Although there are differences between IPF and PF, the migration should not be that hard, as both firewalls use similar concepts. We keep PF on Oracle Solaris as close to upstream as possible, so many guides and recipes found on the Internet should work for Oracle Solaris as well; just keep in mind that NAT-64, pfsync and bandwidth management were not ported to Oracle Solaris. We hope you'll enjoy the PF ride and will stay tuned for more on the Oracle Solaris 11.4 release.

Links

Oracle Solaris Documentation on IPF, PF, and comparing IPF to PF.

[ goo.gl/YyDMVZ ] https://sourceforge.net/projects/ipfilter/

[ goo.gl/UgwCth ] https://www.openbsd.org/faq/pf/

[ goo.gl/eTjn54 ] https://marc.info/?l=openbsd-misc&m=111894940807554&w=2

James McPherson wrote an excellent blog on the new observability and data gathering tools in the Oracle Solaris 11.4 Beta that are part of the Oracle Solaris Analytics project. Enjoy the read. More on this topic to come.

One of the more subtle changes with Oracle Solaris 11.4 is the identity of the operating system, namely the output of uname(1). Obviously we are not changing the release - this is still SunOS 5.11, which brings along the interface stability levels that Oracle Solaris has delivered for decades. However, based in part on customer feedback and in part on internal requirements, the version level now displays finer-grained information:

$ uname -v
11.4.0.12.0

If we compare this to the output from an Oracle Solaris 11.3 machine running Support Repository Update 28:

$ uname -v
11.3

So we now have 3 extra digits to convey more information, whose meaning is:

11.<update>.<sru>.<build>.<reserved>

Update: the Oracle Solaris update. From the above we can see this is Oracle Solaris 11 Update 4.

SRU: the SRU number. Again from the above, as this is a beta release it is not an SRU, so the value is 0.

build: the build of the update or, if the SRU number is non-zero, of the SRU.

reserved: a number that will be used to reflect some internal mechanisms - for example, if we discover a problem with a build but the next build has already been started, then we will use this number as a 'respin'.

Taking this forward, when the 9th SRU is produced the output of uname -v will be:

$ uname -v
11.4.9.4.0

As an SRU typically has 4 builds we see the build number is 4 and, as the SRU is perfect,  the reserved field is 0.

One other important change, which is not immediately obvious, is that the version is no longer encoded in the kernel at build time. Rather, it is read from a file at boot: /etc/versions/uts_version. This brings up an important point: the system identity can be changed without the need to deliver a new kernel. This means that, potentially, SRUs can be delivered that modify userland binaries AND update the system identity (i.e. no need to reboot to update the system identity). This file is protected from being modified via extended attributes used within the packaging system, and because it is a delivered file it is easy to detect if it has been modified:

# pkg verify -p /etc/versions/uts_version
PACKAGE                                                                 STATUS
pkg://solaris/entire                                                     ERROR
        file: etc/versions/uts_version
                ERROR: Hash: 7d5ef997d22686ef7e46cc9e06ff793d7b98fc14 should be 0fcd60579e8d0205c345cc224dfb27de72165d54
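Reading the file directly is also a quick way to check the identity; the output below is an assumption based on the uname -v example from this beta system:

   $ cat /etc/versions/uts_version
   11.4.0.12.0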


What about the previous advice to use 'pkg info entire' to identify the system? Using the packaging system to identify what is running on the system continues to be a sure-fire way, simply because trying to convey detailed information in 5 digits is really impossible for a variety of reasons.

The versions (FMRIs) of packages have also been changed. We have taken the leap to make them more reflective of the release. For example:

$ pkg list -H entire
entire              11.4-11.4.0.0.0.12.1       i--

whereas in the 11.3 SRU case:

$ pkg list -H entire
entire              0.5.11-0.175.3.28.0.4.0    i--

The FMRI now indicates this is 11.4, and the branch scheme (the numbers after the '-') has replaced 0.175 with a much more useful release version. The above example indicates Oracle Solaris 11.4, build 12 and a respin of 1. The exact details of the various digits are documented in pkg(7).

January 30, 2018

For the last few weeks, Benjamin Tissoires and I have been working on a new project: Tuhi [1], a daemon to connect to and download data from Wacom SmartPad devices like the Bamboo Spark, Bamboo Slate and, eventually, the Bamboo Folio and the Intuos Pro Paper devices. These devices are not traditional graphics tablets plugged into a computer but rather smart notepads where the user's offline drawing is saved as stroke data in vector format and later synchronised with the host computer over Bluetooth. There it can be converted to SVG, integrated into the applications, etc. Wacom's application for this is Inkspace.

There is no official Linux support for these devices. Benjamin and I started looking at the protocol dumps last year and, luckily, they're not completely indecipherable and reverse-engineering them was relatively straightforward. Now it is a few weeks later and we have something that is usable (if a bit rough) and provides the foundation for supporting these devices properly on the Linux desktop. The repository is available on github at https://github.com/tuhiproject/tuhi/.

The main core is a DBus session daemon written in Python. That daemon connects to the devices and exposes them over a custom DBus API. That API is relatively simple: it supports methods to search for devices, pair devices, listen for data from devices and finally to fetch the data. It has some basic extras built in, like temporary storage of the drawing data so it survives daemon restarts. But otherwise it's a three-way mapper between the BlueZ device, the serial controller we talk to on the device, and the Tuhi DBus API presented to the clients. One such client is the little commandline tool that comes with tuhi: tuhi-kete [2]. Here's a short example:


$> ./tools/tuhi-kete.py
Tuhi shell control
tuhi> search on
INFO: Pairable device: E2:43:03:67:0E:01 - Bamboo Spark
tuhi> pair E2:43:03:67:0E:01
INFO: E2:43:03:67:0E:01 - Bamboo Spark: Press button on device now
INFO: E2:43:03:67:0E:01 - Bamboo Spark: Pairing successful
tuhi> listen E2:43:03:67:0E:01
INFO: E2:43:03:67:0E:01 - Bamboo Spark: drawings available: 1516853586, 1516859506, [...]
tuhi> list
E2:43:03:67:0E:01 - Bamboo Spark
tuhi> info E2:43:03:67:0E:01
E2:43:03:67:0E:01 - Bamboo Spark
Available drawings:
* 1516853586: drawn on the 2018-01-25 at 14:13
* 1516859506: drawn on the 2018-01-25 at 15:51
* 1516860008: drawn on the 2018-01-25 at 16:00
* 1517189792: drawn on the 2018-01-29 at 11:36
tuhi> fetch E2:43:03:67:0E:01 1516853586
INFO: Bamboo Spark: saved file "Bamboo Spark-2018-01-25-14-13.svg"
I won't go into the details because most should be obvious and this is purely a debugging client, not a client we expect real users to use. Plus, everything is still changing quite quickly at this point.

The next step is to get a proper GUI application working. As usual with any GUI-related matter, we'd really appreciate some help :)

The project is young and relying on reverse-engineered protocols means there are still a few rough edges. Right now, the Bamboo Spark and Slate are supported because we have access to those. The Folio should work, it looks like it's a re-packaged Slate. Intuos Pro Paper support is still pending, we don't have access to a device at this point. If you're interested in testing or helping out, come on over to the github site and get started!

[1] tuhi: Maori for "writing, script"
[2] kete: Maori for "kit"

January 26, 2018

We launched PipeWire last September with this blog entry. I thought it would be interesting for people to hear about the latest progress on what I believe is going to be a gigantic step forward for the Linux desktop. So I caught up with Pipewire creator Wim Taymans during DevConf 2018 in Brno where Wim is doing a talk about Pipewire and we discussed the current state of the code and Wim demonstrated a few of the things that PipeWire now can do.

Christian Schaller and Wim Taymans testing PipeWire with Cheese

Priority number 1: video handling

So as we said when we launched, the top priority for PipeWire is to address our needs on the video side of multimedia. This is critical due to the more secure nature of Wayland, which makes the old methods for screen sharing not work anymore, and the emergence of desktop containers in the form of Flatpak. Thus we need PipeWire to help us provide application and desktop developers with a new method for doing screen sharing and also to provide a secure way for applications inside a container to access audio and video devices on the system.

There are 3 major challenges PipeWire wants to solve for video. One is device sharing, meaning that multiple applications can share the same video hardware device; second, it wants to be able to do so in a secure manner, ensuring your video streams are not hijacked by a rogue process; and finally it wants to provide an efficient method for sharing multimedia between applications, like for instance fullscreen capture from your compositor (like GNOME Shell) to your video conferencing application running in your browser, like Google Hangouts, Blue Jeans or Pexip.

So the first thing Wim showed me in action was the device sharing. We launched the GNOME photobooth application Cheese, which gets PipeWire support for free thanks to the PipeWire GStreamer plugin. And this is an important thing to remember: thanks to so many Linux applications using GStreamer these days we don’t need to port each one of them to PipeWire; instead the PipeWire GStreamer plugin does the ‘porting’ for us. We then launched a gst-launch command line pipeline in a terminal. The result is two applications sharing the same webcam input without one of them blocking access for the other.

Cheese and a GStreamer pipeline running on PipeWire

As you can see from the screenshot above it worked fine, and this was actually done on my Fedora Workstation 27 system; the only thing we had to do was to start the ‘pipewire’ process in a terminal before starting Cheese and the gst-launch pipeline. GStreamer autoplugging took care of the rest. So feel free to try this out yourself if you are interested, but be aware that you will find bugs quickly if you try things like on-the-fly resolution changes or switching video devices. This is still tech-preview-level software in Fedora 27.
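In other words, something along these lines should reproduce the test (the exact gst-launch pipeline isn’t given above, so the one below is an illustrative guess that assumes the PipeWire GStreamer plugin’s pipewiresrc element is installed):

$ pipewire &
$ cheese &
$ gst-launch-1.0 pipewiresrc ! videoconvert ! autovideosink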

The plan is for Wim Taymans to sit down with the web browser maintainers at Red Hat early next week and see if we can make progress on supporting PipeWire in Firefox and Chrome, so that conferencing software like the ones mentioned above can start working fully under Wayland.

Since security was one of the drivers for the move to Wayland from X Windows, we of course also put a lot of emphasis on not recreating the security holes of X in the compositor. So the way PipeWire now works is that if an application wants to do full screen capture it will check with the compositor through a D-Bus API, or a portal in Flatpak and Wayland terminology, which only allows the permitted application to do the screen capture, so the stream can’t be hijacked by a random rogue application or process on your computer. This also works from within a sandboxed setting like Flatpaks.

Jack Support

Another important goal of PipeWire was to bring all Linux audio and video together, which means PipeWire needed to be as good as or a better replacement for Jack for the pro-audio usecase. This is a tough usecase to satisfy, so while getting the video part working has been the top development priority, Wim has also worked on verifying that the design allows for the low latency and control needed for pro-audio. To do this Wim has implemented the Jack protocol on top of PipeWire.

Carla, a Jack application running on top of PipeWire.


Through that work he has now verified that he is able to achieve the low latency needed for pro-audio with PipeWire and that he will be able to run Jack applications without changes on top of PipeWire. So above you see a screenshot of Carla, a Jack-based application running on top of PipeWire with no Jack server running on the system.

ALSA/Legacy applications

Another item Wim has written the first code for and verified will work well is the ALSA emulation. The goal of this piece of code is to allow applications using the ALSA userspace API to output to PipeWire without needing special porting or application developer effort. At Red Hat we have many customers with older bespoke applications using this API, so it has been of special interest for us to ensure this works just as well as the native ALSA output. It is also worth noting that PipeWire also does mixing, so that sound being routed through ALSA will get seamlessly mixed with audio coming through the Jack layer.

Bluetooth support

The last item Wim has spent some time on since last September is making sure Bluetooth output works, and he demonstrated this to me while we were talking during DevConf. The PipeWire Bluetooth module plugs directly into the BlueZ Bluetooth framework, meaning that things like the GNOME Bluetooth control panel just work with it without any porting needed. And while the code is still quite young, Wim demonstrated pairing and playing music over Bluetooth using it to me.

What about PulseAudio?

So as you probably noticed, one thing we didn’t mention above is how to deal with PulseAudio applications. Handling this usecase is still on the todo list, and the plan is to at least initially just keep PulseAudio running on the system, outputting its sound through PipeWire. That said, we are a bit unsure how many applications would actually be using this path, because as mentioned above all GStreamer applications for instance would be PipeWire-native automatically through the PipeWire GStreamer plugins. And for legacy applications the PipeWire ALSA layer would replace the current PulseAudio ALSA layer as the default ALSA output, meaning that the only applications left are those outputting to PulseAudio directly themselves. The plan would also be to keep the PulseAudio ALSA device around, so if people want to use things like the PulseAudio networked audio functionality they can choose the PA ALSA device manually and keep doing so.
Over time the goal would of course be to not have to keep the PulseAudio daemon around, but dropping it completely is likely to be a multiyear process with current plans, so it is kinda like XWayland on top of Wayland.

Summary

So you might read this and think: hey, with all this work we are almost done, right? Well, unfortunately no; the components mentioned here are good enough for us to verify the design and features, but they still need a lot of maturing and testing before they will be in a state where we can consider switching Fedora Workstation over to using them by default. So there are many warts that still need to be cleaned up, but a lot of things have become much more tangible now than when we last spoke about PipeWire in September. The video handling we hope to enable in Fedora Workstation 28 as mentioned, while the other pieces we will work towards enabling in later releases as the components mature.
Of course, the more people interested in joining the PipeWire community to help us out, the quicker we can mature these different pieces. So if you are interested please join us in #pipewire on irc.freenode.net or just clone the code from GitHub and start hacking. You find the details for IRC and git here.

January 15, 2018

I got past first triangle (in simulation) on V3D 4.1, and got texturing working as well. The big feature was to enable multithreaded fragment shaders, which I first did on V3D 3.3. Once I had that, it was easy to port over to 4.1 and get my first fragment shader running. Other tasks this week:

  • Ported the command list and state dumping code to 4.1
  • Added support for V3D 4.1’s arbitrary register writes in varying loads and texturing
  • Ported the other state emits besides the draw calls
  • Enabled texturing on V3D 4.1 (border color swizzling still broken)
  • Fixed infinite loops in shaders without all the channels active
  • Fixed gl_FragCoord pixel center setup
  • Fixed overflows of the tile state array
  • Started writing tools for debugging GPU hangs

This is all pushed to Mesa master now.

January 08, 2018

For VC5 last week I pushed to get first triangle on V3D 4.1. That involved:

  • Updating the build system to build components of the driver for 3.3 and 4.1
  • Encoding of new instructions and signals
  • Updating the compiler to emit the new VPM load/store instructions
  • Porting the RCL for the new TLB load/store commands
  • Porting the drawing commands and shader state packets
  • Porting the simulator wrapper

By Friday I had the simulator running through to the end of a fragment shader, at which point it complained that I hadn’t done the threading right. On this chip, you only get 2 or 4-way threaded programs, while I only had support for single-threaded programs. Given how important multithreading is for latency hiding, next week I’ll be going back to add it to V3D 3.3 and then port it forward to 4.1.

January 07, 2018

Last year, just before the holidays, Benjamin Tissoires and I worked on a 'new' project - libevdev-python. This is, unsurprisingly, a Python wrapper to libevdev. It's not exactly new since we took the git tree from 2016 when I was working on it the first time round, but this time we whipped it into better shape. Now it's at the point where I think it has the API it should have: pythonic and very easy to use, but still with libevdev as the actual workhorse in the background. It's available via pip3 and should be packaged for your favourite distributions soonish.

Who is this for? Basically anyone who needs to work with the evdev protocol. While C is still a thing, there are many use-cases where Python is a much more sensible choice. The libevdev-python documentation on ReadTheDocs provides a few examples which I'll copy here, just so you get a quick overview. The first example shows how to open a device and then continuously loop through all events, searching for button events:


import sys
import libevdev

fd = open('/dev/input/event0', 'rb')
d = libevdev.Device(fd)
if not d.has(libevdev.EV_KEY.BTN_LEFT):
    print('This does not look like a mouse device')
    sys.exit(0)

# Loop indefinitely while pulling the currently available events off
# the file descriptor
while True:
    for e in d.events():
        if not e.matches(libevdev.EV_KEY):
            continue

        if e.matches(libevdev.EV_KEY.BTN_LEFT):
            print('Left button event')
        elif e.matches(libevdev.EV_KEY.BTN_RIGHT):
            print('Right button event')
The second example shows how to create a virtual uinput device and send events through that device:

import libevdev
d = libevdev.Device()
d.name = 'some test device'
d.enable(libevdev.EV_REL.REL_X)
d.enable(libevdev.EV_REL.REL_Y)
d.enable(libevdev.EV_KEY.BTN_LEFT)
d.enable(libevdev.EV_KEY.BTN_MIDDLE)
d.enable(libevdev.EV_KEY.BTN_RIGHT)

uinput = d.create_uinput_device()
print('new uinput test device at {}'.format(uinput.devnode))
events = [libevdev.InputEvent(libevdev.EV_REL.REL_X, 1),
          libevdev.InputEvent(libevdev.EV_REL.REL_Y, 1),
          libevdev.InputEvent(libevdev.EV_SYN.SYN_REPORT, 0)]
uinput.send_events(events)
And finally, if you have a textual or binary representation of events, the evbit function helps to convert it to something useful:

>>> import libevdev
>>> print(libevdev.evbit(0))
EV_SYN:0
>>> print(libevdev.evbit(2))
EV_REL:2
>>> print(libevdev.evbit(3, 4))
ABS_RY:4
>>> print(libevdev.evbit('EV_ABS'))
EV_ABS:3
>>> print(libevdev.evbit('EV_ABS', 'ABS_X'))
ABS_X:0
>>> print(libevdev.evbit('ABS_X'))
ABS_X:0
The latter is particularly helpful if you have a script that needs to analyse event sequences and look for protocol bugs (or hw/fw issues).

More explanations and details are available in the libevdev-python documentation. That doc also answers the question why libevdev-python exists when there's already a python-evdev package. The code is up on github.

January 03, 2018

For VC5 features:

  • Fixed bugs in shader storage buffer objects / atomic counters.
  • Added partial shader_image_load_store support.
  • Started investigating how to support compute shaders
  • More progress on Vulkan image layouts.
  • Updated to a current version of the SW simulator, fixed bugs that the update revealed.
  • Reworked TMU output size/channel count setup.
  • Fixed flat shading for >24 varying components.
  • Fixed conditional discards within control flow (dEQP testcase)
  • Fixed CPU-side tiling for miplevels > 1 (dEQP testcase)
  • Fixed render target setup for RGB10_A2UI (GPU hangs)
  • Started on V3D 4.1 support.

For shader_image_load_store, we (like some cases of Intel) have to do manual tiling address decode and format conversion on top of SSBO accesses, and (like freedreno and some cases of Intel) want to use normal texture accesses for plain loads. I’ve started on a NIR pass that will split apart the tiling math from the format conversion math so that we can all hopefully share some code here. Some tests are already passing.

The compute shaders are interesting on this hardware. There’s no custom hardware support for them. Instead, you can emit a certain series of line primitives to the fragment shader such that you get shader instances spawned on all the QPUs in the right groups. It’s not pretty, but it means that the infrastructure is totally shared.

On the VC4 front, I got a chance to try out Boris’s performance counter work, using 3DMMES as a testcase. He found frameretrace really hard to work on, and so we don’t have a port of it yet (the fact that porting is necessary seems like a serious architectural problem). However, I was able to use things like “apitrace --pdrawcalls=GL_AMD_performance_monitor:QPU-total-clk-cycles-waiting-varyings” to poke at the workload.

There are easy ways to be led astray with performance counter support on a tiling GPU (since we flush the frame at each draw call, the GPU spends all its time loading/storing the frame buffer instead of running shaders, so looking at idle clock cycles at the draw-call level is misleading). However, being able to look at things like cycles spent in the shaders of each draw call let us approximate the total time spent in each shader, to direct optimization work.

December 29, 2017
We've been passing the Vulkan conformance test suite 1.0.2 mustpass list on radv for quite a while now on the CIK/VI/Polaris cards. However, Vega hadn't achieved the same pass rate.

With a bunch of fixes I pushed this morning and one fix for all GPUs, we now have the same pass rate on all GPUs and 0 fails.

This means Vega on radv can now be submitted for conformance under Vulkan 1.0. I'm not sure when I'll get time to do the paperwork, maybe early next year sometime.


December 19, 2017


Since joining an enterprise (the world’s largest business-travel company) 6 months ago to drive their DevOps transformation, my ongoing mental evolution regarding the value of technology has gone through an almost religious rebirth. I now think in a completely different way than I did 10 years ago about what technology is important and when you need it. If you want to become a 10x engineer, you need a different perspective than just working on things because they seem cool. It’s about working toward the right outcomes, whereas most of us focus on the inputs (what tech you use, how many hours you work).

It all comes down to business value. You need to contribute to one of the core factors of business value, or however incredible the technology is, it just doesn’t make a difference. If you don’t know what that really means, you’re not alone — most of the technologists I know have trouble articulating the business model of their employers.

I think about it as 4 primary factors:

  1. Money. This comes in two flavors. First, you’re creating new efficiency, which increases the profit margin. This could either be through lowering the underlying fixed costs of running the business, or decreasing the cost of goods/services sold by saving a little money on every one. Second, you’re increasing sales, which grows overall revenue. In a cost center within a larger enterprise, or in saturated markets, the former is the most common mode of operation because it’s hard to capture new opportunities. In the latter, it’s about growth mode – investing to capture new value, and often assuming you can make it profitable later. This could be framed as “land and expand” or with the assumption that your company will increase the price and margin once it’s gained a sufficient market share to do so with lower risk. Do you understand your company’s business model? Where does the money come from, who are the customers, what are their needs, what is the sales process and cycle, and what are they buying?
  2. Speed. Again, there’s a couple of versions of this that overlap. The overall goals are either initial time to market or speed of iteration. Time to market can come at the expense of significant technical debt, while long-term accelerated iteration cycles are about product-market fit. If you know of the Lean Startup approach promoted by Eric Ries, this should sound familiar. From a long-term perspective, iteration cycles require a balanced approach of customer perspective and technical debt. Otherwise, your company can’t deliver value to customers quickly due to accruing interest on its tech debt. In practice, this can drive an approach that involves gradual refactors with the assumption that long-term rewrites (or e.g. strangler pattern) will be required. It’s the classic “design for 10x but rewrite before 100x,” to paraphrase Google’s Jeff Dean.
  3. Risk. As before, this essentially boils down to executing on new opportunity or loss to existing opportunity. Dan McKinley has a fantastic post on why you should choose boring technology, because the important risks are in the business model vs the tech. You should only make a small number of bets on new technology when it will really make a difference in your ability to deliver on business value. For existing opportunity, it’s more about risk avoidance. Typical approaches tend to end up in some mainframe application that one nearly retirement-age developer knows but is afraid to touch. However, a more sustainable model is to implement heavy automation if it truly is a business-critical application that justifies the investment. Relatedly, risk avoidance is where security shines. One of my favorite perspectives is Google’s BeyondCorp model, which assumes your perimeter is compromised and acts accordingly.
  4. Strategy. Often not immediately visible in the above approaches, investing in strategic growth opportunities is consistently a great path to success in your business. Do you know your company’s strategy? They probably have posters up and meetings about it all the time. Could you say it out loud? Do you know how it maps to concrete actions? Although any individual opportunity may fail, your contribution to executing on the technology behind that opportunity will not go unnoticed. Similarly, if you’re involved in divesting from areas your employer wants to leave as part of its strategy, you have a real but often smaller opportunity to leave your mark upon the work.

Although many other factors have an impact upon business value, those are 4 of the most important ones that can make you consistently successful as a technologist. The key is to understand which ones play into your work, so you can act accordingly in your day-to-day efforts and as part of your career strategy. Are you building software for a cost center, a growth incubator, a risk center, or at a company that cares to invest in speed? Taking full advantage of this approach could make you the 10x engineer you’ve always wanted to be. Best of luck in your journey, and may you spend time where it matters!


Having spent 20 years of my life on Desktop Linux, I thought I should write up my thinking about why we so far haven't had the Linux on the Desktop breakthrough and, maybe more importantly, talk about the avenues I see for that breakthrough still happening. There has been a lot written on this over the years, with different people coming up with their explanations. My thesis is that there really isn't one reason, but rather a range of issues that have all contributed to holding the Linux Desktop back from reaching a bigger market. To put this into context, success here in my mind would be having something like 10% market share of desktop systems; to me that means we have reached critical mass. So let me start by listing some of the main reasons I see for why we are not at that 10% mark today, before going on to talk about how I think that goal might still be possible to reach going forward.

Things that have held us back

  • Fragmented market
  • One of the most common explanations for why the Linux Desktop never caught on more is the fragmented state of the Linux Desktop space. We have a large host of desktop projects like GNOME, KDE, Enlightenment, Cinnamon etc. and an even larger host of distributions shipping these desktops. I used to think this state should get a lot of the blame, and I still believe it owns some of it, but I have also come to conclude in recent years that it is probably more of a symptom than a cause. If someone had come up with a model strong enough to let Desktop Linux break out of its current technical user niche, then I am now convinced that model would easily have also been strong enough to leave the Linux desktop fragmentation behind for all practical purposes, because at that point the alternative desktops for Linux would be as important as the alternative MS Windows shells are. So in summary, the fragmentation certainly hasn't helped and is still not helpful, but it is probably a problem that has been overstated.

  • Lack of special applications
  • Another common item that has been pointed to is the lack of applications. We know that for sure in the early days of Desktop Linux the challenge you always had when trying to convince anyone of moving to Desktop Linux was that they almost invariably had one or more application they relied on that was only available on Windows. I remember in one of my first jobs after University when I worked as a sysadmin we had a long list of these applications that various parts of the organization relied on, be that special tools to interface with a supplier, with the bank, dealing with nutritional values of food in the company cafeteria etc. This is a problem that has been in rapid decline for the last 5-10 years due to the move to web applications, but I am sure that in a given major organization you can still probably find a few of them. But between the move to the web and Wine I don’t think this is a major issue anymore. So in summary this was a major roadblock in the early years, but is a lot less of an impediment these days.

  • Lack of big name applications
  • Adopting a new platform is always easier if you can take the applications you are familiar with with you, so the lack of things like MS Office and Adobe Photoshop would always make a switch less likely, simply because in addition to switching OS you would also have to learn to use new tools. And of course along those lines there was always the challenge of file format compatibility, in the early days in a hard sense that you simply couldn't reliably load documents coming from some of these applications, and more recently softer problems like the lack of metrically identical fonts. The font issue, for example, has been mostly resolved since Google released fonts metrically compatible with the MS default fonts a few years ago, but it was definitely a hindrance to adoption for many years. The move to the web for a lot of these things has greatly reduced this problem too, with organizations adopting things like Google Docs at a rapid pace these days. So in summary, this is once again something that used to be a big problem but is a lot less of one these days, although of course there are still apps not available for Linux that do stop people from adopting desktop Linux.

  • Lack of API and ABI stability
  • This is another item that many people have brought up over the years, and I have personally vacillated over its importance multiple times. Changing APIs are definitely not a fun thing for developers to deal with; they add extra work, often without bringing direct benefit to the application. The Linux packaging philosophy probably magnified this problem for developers, since anything that could be split out and packaged separately was, meaning that every application was always living on top of a lot of moving parts. That said, the reason I am sceptical of putting too much blame onto this is that you could always find stable subsets to rely on. For instance, if you targeted GTK2 or Qt back in the day and kept away from some of the faster-moving stuff offered by GNOME and KDE, you would not be hit with this that often. And of course if the Linux Desktop market share had been higher, people would have been prepared to deal with these challenges regardless, just as they are on other platforms that keep changing and evolving quickly, like the mobile operating systems.

  • Apple resurgence
  • This might of course be the result of subjective memory, but one of the times where it felt like there could have been a Linux desktop breakthrough was at the same time as Linux on the server started making serious inroads. The old Unix workstation market was coming apart and moving to Linux already, the worry of a Microsoft monopoly was at its peak, and Apple was in what seemed like mortal decline. There was a lot of media buzz around the Linux desktop, and VC-funded companies were set up to try to build a business around it. Reaching some kind of critical mass seemed like it could be within striking distance. Of course what happened was that Steve Jobs returned to Apple and we suddenly had MacOSX come onto the scene, taking at least some air out of the Linux Desktop space. The importance of this one I find exceptionally hard to quantify though: part of me feels it had a lot of impact, but on the other hand it isn't 100% clear to me that the market and the players at the time would have been able to capitalize even if Apple had gone belly-up.

  • Microsoft aggressive response
  • In the first 10 years of Desktop Linux there was no doubt that Microsoft was working hard to nip any sign of Desktop Linux gaining a foothold or momentum. I remember for instance that Novell for quite some time was trying to establish a serious Desktop Linux business after having bought Miguel de Icaza's company Helix Code. However, a pattern quickly emerged: every time Novell or anyone else tried to announce a major Linux desktop deal, Microsoft came running in offering next-to-free Windows licensing to get people to stay put. Looking at Linux migrations even seemed to become a go-to policy for negotiating better prices from Microsoft. So anyone wanting to attack the desktop market with Linux had to contend not only with market inertia, but with a general depression of the price of desktop operating systems, knowing that Microsoft would respond to any attempt to build momentum around Linux desktop deals with very aggressive sales efforts. This probably played an important part, as it meant that the pay-per-copy/subscription business model that Red Hat, for instance, built their server business around became really tough to make work in the desktop space: because the price point ended up so low, it required gigantic volumes to become profitable, which is of course a hard thing to achieve quickly when fighting an entrenched market leader. So in summary, Microsoft in some sense successfully fended off Linux breaking through as a competitor, although it could be said they did so at the cost of fatally wounding the per-copy fee business model they built their company around, and ensured that the next wave of competitors Microsoft had to deal with, like iOS and Android, based themselves on business models where the cost of the OS was assumed to be zero, thus contributing to the Windows Phone efforts being doomed.

  • Piracy
  • One of the big aspirations of the Linux community from the early days was the idea that an open source operating system would enable more people to afford running a computer and thus take part in the economic opportunities that the digital era would provide. For the desktop space there was always this idea that while Microsoft was entrenched in North America and Europe, there was an ocean of people in the rest of the world who had never used a computer before and thus would be more open to adopting a desktop Linux system. I think this has so far panned out only to a limited degree: running a Linux distribution has surely opened job and financial opportunities for a lot of people, yet when you look at things from a volume perspective most of these potential Linux users found that a pirated Windows copy suited their needs just as well or better. As an anecdote here, there was recently a bit of noise and writing around the sudden influx of people on Steam playing Player Unknown: Battlegrounds, as it caused the relative Linux marketshare to decline. Most of these people turned out to be running Windows in Mandarin. Studies have found that about 70% of all software in China is unlicensed, so I don't think I am going too far out on a limb here assuming that most of these gamers are not providing Microsoft with Windows licensing revenue, but it does illustrate the challenge of getting these people onto Linux, as they are already getting an operating system for free. So in summary, in addition to facing cut-throat pricing from Microsoft in the business sector, one had to overcome the basically free price of pirated software in the consumer sector.

  • Red Hat mostly stayed away
  • Few people probably remember or know this, but Red Hat was actually founded as a desktop Linux company. The first major investment in software development that Red Hat ever made was setting up the Red Hat Advanced Development Labs, hiring a bunch of core GNOME developers to move that effort forward. But when Red Hat pivoted to the server with the introduction of Red Hat Enterprise Linux, the desktop quickly started playing second fiddle. And before I proceed: all these events were many years before I joined the company, so just as with my other points here, read this as an analysis from someone without first-hand knowledge. While Red Hat has always offered a desktop product and has always been a major contributor to keeping the Linux desktop ecosystem viable, Red Hat was focused on server-side solutions, and the desktop offering was always aimed more narrowly at things like technical workstation customers and people developing towards the RHEL server. It is hard to say how big an impact Red Hat's decision not to go after this market has had; on one side it would probably have been beneficial to have the Linux company with the deepest pockets and the strongest brand be a more active participant, but on the other hand staying mostly out of the fight gave other companies more room to give it a go.

  • Canonical business model not working out
  • This bullet point is probably going to be somewhat controversial considering I work for Red Hat (although this is my private blog with my own personal opinions), but on the other hand I feel one cannot talk about the trajectory of the Linux Desktop over the last decade without mentioning Canonical and Ubuntu. I have to assume that when Mark Shuttleworth was mulling over doing Ubuntu he probably saw a lot of the challenges that I mention above, especially the revenue-generation challenges that the competition from Microsoft provided. So in the end he decided on the standard internet business model of the time, which was to try to quickly build up a huge userbase and then deal with how to monetize it later on. Ubuntu was launched with an effective price point of zero; in fact you could even get install media sent to you for free. The effort worked in the sense that Ubuntu quickly became the biggest player in the Linux desktop space, and it certainly helped the Linux desktop marketshare grow in the early years. Unfortunately I think it still basically failed, and the reason I am saying that is that it didn't manage to grow big enough to provide Ubuntu with enough revenue through their app store or their partner agreements to allow them to seriously re-invest in the Linux Desktop and in the kind of marketing effort needed to take Linux to a less super-technical audience. So once it plateaued, what they had was enough revenue to keep a relatively barebones engineering effort going, but not the kind of income that would allow them to steadily build the Linux Desktop market further. Mark then tried to capitalize on the mindshare and market share he had managed to build by branching out into efforts like their TV and phone projects, but all those efforts eventually failed.
    It would probably be an article in itself to deeply discuss why the grow-the-userbase strategy failed here while, for instance, Android succeeded with the same model, but I think the short version comes down to the fact that you had an entrenched market leader and the Linux Desktop isn't different enough from Mac or Windows desktops to drive the type of market change that the transition from feature phones to smartphones did.
    And to be clear, I am not criticizing Mark here for the strategy he chose; if I were in his shoes back when he started Ubuntu I am not sure I would have been able to come up with a different strategy that would have been plausible to succeed from his starting point. That said, it did contribute to pushing the expected price of desktop Linux down even further, thus making it even harder for people to generate significant revenue from desktop Linux. On the other hand, one can argue that this would likely have happened anyway due to competitive pressure and Windows piracy. Canonical's recent focus pivot away from the desktop towards trying to build a business in the server and IoT space is in some sense a natural consequence of hitting the desktop growth plateau and not having enough revenue to invest in further growth.
    So in summary, what was once seen as the most likely contender to take the Linux Desktop to critical mass turned out to have taken off with too little rocket fuel, and eventually gravity caught up with them. And what we can never know for sure is whether, during this run, they sucked so much air out of the market that it kept someone who could have taken us further with a different business model from jumping in.

  • Original device manufacturer support
  • This one is a bit of a chicken-and-egg issue. Yes, lack of (perfect) hardware support has for sure kept Linux back on the Desktop, but lack of marketshare has also kept hardware support back. As with any system, this is a question of reaching critical mass despite your challenges and thus eventually being so big that nobody can afford to ignore you. This is an area where even today we are still not fully there yet, but where I do feel we are getting closer all the time. When I installed Linux for the very first time, which I think was Red Hat Linux 3.1 (pre-RHEL days), I spent about a weekend fiddling just to get my sound card working. I think I had to grab an experimental driver from somewhere and compile it myself. These days I mostly expect everything to work out of the box, except more unusual hardware like ambient light sensors or fingerprint readers, but even such devices are starting to land, and thanks to efforts from vendors such as Dell things are looking pretty good here. But the memory of these issues is long, so a lot of people, especially those who are not using Linux themselves but have heard about it, still assume hardware support is very much a hit-or-miss affair.

What does the future hold?

Anyone who has read my blog posts probably knows I am an optimist by nature. This isn't just some kind of genetic disposition towards optimism, but also a philosophical belief that optimism breeds opportunity while pessimism breeds failure. So just because we haven't gotten the Linux Desktop to 10% marketshare so far doesn't mean it will not happen going forward; it just means we haven't achieved it yet. One of the key qualities of open source is that it is incredibly hard to kill: unlike proprietary software, just because a company goes out of business or decides to shut down a part of its business, the software doesn't go away or stop getting developed. As long as there is a strong community interested in pushing it forward, it remains and evolves, and thus when opportunity comes knocking again it is ready to try again. And that is definitely true of Desktop Linux, which from a technical perspective is better than it has ever been: the level of polish is higher than ever before, the level of hardware support is better than ever before, and the range of software available is better than ever before.

And the important thing to remember here is that we don't exist in a vacuum; the world around us constantly changes too, which means that the things or the companies that blocked us in the past might not be around or able to block us tomorrow. Apple and Microsoft are very different companies today than they were 10 or 20 years ago, and their focus and who they compete with are very different. The dynamics of the desktop software market are changing with new technologies and paradigms all the time, like how online media consumption has moved from things like your laptop to phones and tablets. Five years ago I would have considered iTunes a big competitive problem; today the move to streaming services like Spotify, Hulu, Amazon or Netflix has made iTunes feel archaic and a symbol of bygone times.

And many of the problems we faced before, like weird Windows applications without a Linux counterpart, have been washed away by the switch to browser-based applications. And while Valve's SteamOS effort didn't take off, it has provided Linux users with access to a huge catalog of games, removing a reason that I know caused a few of my friends to mostly abandon using Linux on their computers. And as a consumer you can actually buy Linux from a range of vendors now who try to properly support Linux on their hardware, including a major player like Dell and smaller outfits like System76 and Purism.

And since I do work for Red Hat managing our Desktop Engineering team, I should address the question of whether Red Hat will be a major driver in taking Desktop Linux to that 10%. Well, Red Hat will continue to support and evolve our current RHEL Workstation product, and we are seeing steady growth of new customers for it. So if you are looking for a solid developer workstation for your company you should absolutely talk to Red Hat sales about RHEL Workstation, but Red Hat is not looking at aggressively targeting general consumer computers anytime soon. Caveat here: I am not a C-level executive at Red Hat, so I guess there is always a chance Jim Whitehurst or someone else in the top brass is mulling over a gigantic new desktop effort and I simply don't know about it, but I don't think it is likely and thus would not advise anyone to hold their breath waiting for such a thing to be announced :). That said, Red Hat, like any company out there, does react to market opportunities as they arise, so who knows what will happen down the road. And we will definitely keep pushing Fedora Workstation forward as the place to experience the leading edge of the Desktop Linux experience and a great portal into the world of Linux on servers and in the cloud.

So to summarize: there are a lot of things happening in the market that could provide the right set of people the opportunity they need to finally take Linux to critical mass. Whether there is anyone with the timing and skills to pull it off is of course always an open question, and it is a question which will only be answered the day someone does it. The only thing I am sure of is that the Linux community is providing a stronger technical foundation for someone to succeed with than ever before, so the question is just whether someone can come up with the business model and the market skills to take it to the next level. There is also the chance that it will come in a shape we don't appreciate today; for instance maybe ChromeOS evolves into a more full-fledged operating system as it grows in popularity and thus ends up being the Linux on the Desktop endgame? Or maybe Valve decides to relaunch their SteamOS effort and it provides the foundation for major general desktop growth? Or maybe market opportunities arise that will cause us at Red Hat to decide to go after the desktop market in a wider sense than we do today? Or maybe Endless succeeds with their vision for a Linux desktop operating system? Or maybe the idea of a desktop operating system gets supplanted to the degree that we in the end just sit there saying 'Alexa, please open the IDE and take dictation of this new graphics driver I am writing' (ok, probably not that last one ;)

And to be fair, there are a lot of people saying that Linux already made it on the desktop in the form of things like Android tablets. That is technically correct, as Android does run on the Linux kernel, but I think for many of us it feels more like a distant cousin than a close family member, both in terms of the use cases it targets and in terms of technological pedigree.

As a sidenote, I am heading off on Yuletide vacation tomorrow evening, taking my wife and kids to Norway to spend time with our family there. So don't expect a lot of new blog posts from me until I am back from DevConf in early February. I hope to see many of you at DevConf though, it is a great conference and Brno is a great town even in freezing winter. As we say in Norway, there is no such thing as bad weather, only bad clothing.

December 18, 2017

I'm sad to say it's the end of the road for me with Gentoo, after 13 years volunteering my time (my "anniversary" is tomorrow). My time and motivation to commit to Gentoo have steadily declined over the past couple of years and eventually stopped entirely. It was an enormous part of my life for more than a decade, and I'm very grateful to everyone I've worked with over the years.

My last major involvement was running our participation in the Google Summer of Code, which is now fully handed off to others. Prior to that, I was involved in many things from migrating our X11 packages through the Big Modularization and maintaining nearly 400 packages to serving 6 terms on the council and as desktop manager in the pre-council days. I spent a long time trying to change and modernize our distro and culture. Some parts worked better than others, but the inertia I had to fight along the way was enormous.

No doubt I’ve got some packages floating around that need reassignment, and my retirement bug is already in progress.

Thanks, folks. You can reach me by email using my nick at this domain, or on Twitter, if you’d like to keep in touch.


Tagged: gentoo, x.org
December 15, 2017

So I spent a few hours polishing my crystal ball today, and here are some predictions for Linux on the Desktop in 2018. The advantage of publishing these now is of course that I can later selectively quote the ones I got right to prove my brilliance, while the internet can selectively quote the ones I got wrong to prove my stupidity :)

Prediction 1: Meson becomes the de facto build system of the Linux community

Meson has been going from strength to strength this year, and a lot of projects which passed on earlier attempts to replace autotools have adopted it. I predict this trend will continue in 2018 and that by the end of the year everyone will agree that Meson has replaced autotools as the Linux community build system of choice. That said, I am not convinced the Linux kernel itself will adopt Meson in 2018.

Prediction 2: Rust puts itself on a clear trajectory to replace C and C++ for low level programming

Another rising star of 2017 is the programming language Rust. And while its pace of adoption will be slower than Meson's, I do believe that by the time 2018 comes to a close the general opinion will be that Rust is the future of low level programming, replacing old favorites like C and C++. Major projects like GNOME and GStreamer are already adopting Rust at a rapid pace, and I believe even more projects will join them in 2018.

Prediction 3: Apple's decline as a PC vendor becomes obvious

Ever since Steve Jobs died it has become quite clear, in my opinion, that the emphasis on the traditional desktop is fading at Apple. The pace of hardware refreshes seems to be slowing and MacOS X seems to be going more and more stale. Some pundits have already started pointing this out, and I predict that in 2018 Apple will no longer be considered the cool kid on the block for people looking for laptops, especially among the tech savvy crowd. Hopefully this is a good opportunity for Linux on the desktop to assert itself more.

Prediction 4: Traditional distro packaging for desktop applications will start fading away in favour of Flatpak

From where I am standing I think 2018 will be the breakout year for Flatpak as a replacement for getting your desktop applications as RPMs or debs. I predict that by the end of 2018 more or less every Linux Desktop user will be running at least one Flatpak on their system.
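
If you have never tried it, getting a first Flatpak onto a system is roughly this simple; Flathub and gedit below are just examples I picked to illustrate the workflow, not part of the prediction:

flatpak remote-add --if-not-exists flathub https://flathub.org/repo/flathub.flatpakrepo
# gedit is only an example application; any Flatpak app works the same way
flatpak install flathub org.gnome.gedit
flatpak run org.gnome.gedit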

Prediction 5: Linux Graphics competitive across the board

I think 2018 will be a breakout year for Linux graphics support. I think our GPU drivers and APIs will be competitive with any other platform both in completeness and performance. So by the end of 2018 I predict that you will see Linux game ports by major porting houses like Aspyr and Feral that perform just as well as their Windows counterparts. What is more, I also predict that by the end of 2018 discrete graphics will be considered a solved problem on Linux.

Prediction 6: H265 will be considered a failure

I predict that by the end of 2018 H265 will be considered a failed codec effort and the era of royalty bearing media codecs will effectively start coming to an end. H264 will be considered the last successful royalty bearing codec, and all new codecs coming out will be open source and royalty free.

In the midst of post-release bug fixing, we've also added a fair number of new features to our stack. As usual, new features span a number of different components, so integrators will have to be careful picking up all the components when, well, integrating.

PS3 clone joypad support

Do you have a PlayStation 3 joypad that feels just a little bit "off"? You can't find the Sony logo anywhere on it? The figures on the face buttons look like barbed wire? And if it were a YouTube video, it would say "No copyright intended"?


Bingo. When plugged in via USB, those devices advertise themselves as SHANWAN or Gasia, and implement the bare minimum to work when plugged into a PlayStation 3 console. But as a Linux computer behaves slightly differently, we needed to fix a couple of things.

The first fix was simple, but necessary to be able to do any work: disable the rumble motor that starts as soon as you plug the pad through USB.

Once that was done, we could work around the fact that the device isn't Bluetooth compliant, and hard-code the HID service it's supposed to offer.

Bluetooth LE Battery reporting

Bluetooth Low Energy is the new-fangled (7-year-old) protocol for low throughput devices, from a single coin-cell powered sensor to input devices. What's great is that there's finally a standardised way for devices to export their battery statuses. I've added support for this in BlueZ, which UPower then picks up for desktop integration goodness.
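
If you want to check whether your desktop stack sees the new battery information, UPower's command line tool is the quickest way; the device path below is a made-up example, use whatever upower -e lists for your device:

upower -e
# the path below is hypothetical; pick the matching entry from the list above
upower -i /org/freedesktop/UPower/devices/gaming_input_example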

There are a number of Bluetooth LE joypads available for pickup, including a few that should be firmware upgradeable. Look for "Bluetooth 4" as well as "Bluetooth LE" when doing your holiday shopping.

gnome-bluetooth work

Finally, this is the boring part. Benjamin and I reworked code that's internal to gnome-bluetooth, as used in the Settings panel as well as the Shell, to make it use modern facilities like GDBusObjectManager. The overall effect of this is less code that is less brittle and more reactive when Bluetooth adapters come and go, such as when using airplane mode.
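
If you are curious what that object tree looks like from the outside, you can poke at BlueZ's D-Bus ObjectManager directly; a quick sketch with busctl (any D-Bus introspection tool will do):

busctl --system tree org.bluez
# dump every object and interface BlueZ exports in a single call
busctl --system call org.bluez / org.freedesktop.DBus.ObjectManager GetManagedObjects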

Apart from the kernel patch mentioned above (you'll know if you need it :), those features have been integrated in UPower 0.99.7 and in the upcoming BlueZ 5.48. And they will of course be available in Fedora, both in rawhide and as updates to Fedora 27 as soon as the releases have been done and built.

GG!
December 11, 2017

It’s been a while since I posted a TWIV update, so this one will be big:

For VC5 GL features:

  • Implemented shader storage buffer objects.
  • Fixed 1D texture mipmapping.
  • Worked on 3D texture mipmapping again.
  • Reworked the Z32F_S8 support based on feedback from Rob Clark, and then rebuilt on his alternative Z32F_S8 patch.
  • Fixed GL_CLAMP with texture offsets in NIR lowering.
  • Added support for textureGrad().
  • Fixed textureLod() GL_BASE_LEVEL handling.
  • Fixed sampling from array textures.
  • Fixed pausing of transform feedback primitive queries for meta-ops.
  • Fixed incorrect padding in transform feedback output.
  • Fixed ARB_framebuffer_object mismatching RB size handling.

While running DEQP tests on all this (which unfortunately don’t complete yet due to running out of memory on my 7268 without swap), I’ve also rebased my Vulkan series and started on implementing image layout for it.

I also tested Timothy Arceri’s gallium NIR linking pass. The goal of that is to pack and dead-code eliminate varyings up in shared code. It’s a net ~0 effect on vc4 currently, but it will help vc5, and I may be able to dead-code eliminate some of the vc4 compiler backend now that the IR coming in to the driver is cleaner.

On the VC4 front, Boris has posted a series for performance counter support. This was a pretty big piece of work, and our hope is that with the addition of performance counters we’ll be able to dig into those workloads where vc4 is slower than the closed driver and actually fix them. Unfortunately he hasn’t managed to build frameretrace yet, so we haven’t really tested it on its final intended workload.

For VC4 GL, I did a bit of work on minetest performance, improving the game's fps from around 15 to around 17. Its desktop GL renderer is really unfortunate, using a lot of immediate-mode GL, but I was completely unable to get its GLES renderer branch to build. It also lacks a reproducible/scriptable benchmark mode, so most of my testing was against an apitrace, which is very hard to get useful performance data from.

I debugged a crash in vc4 with large vertex counts that a user had reported, landed a fix for a kernel memory leak, and landed Dave Stevenson’s HVS format support (part of his work on getting video decode into vc4 GL).

Finally, I did a bit of research and work to help unblock Dave Stevenson’s unicam driver (the open source camera driver). Now that we have an ack for the DT binding, we should be able to get it merged for 4.16!

December 06, 2017
A quick post to tell you that we finally added UTC support to Clocks' and the Shell's World Clocks section. And if you're into it, there's also Anywhere on Earth support.

You will need to have git master versions of libgweather (our cities and timezones database), and gnome-clocks. This feature will land in GNOME 3.28.
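
If you want to sanity-check those timezones from a terminal while waiting for 3.28: UTC is plain UTC, and "Anywhere on Earth" is conventionally UTC-12, which tzdata spells Etc/GMT+12 (the Etc/ zone names use the inverted POSIX sign convention):

TZ=UTC date
# Etc/GMT+12 is UTC-12, i.e. "Anywhere on Earth"; the sign in Etc/ names is inverted on purpose
TZ=Etc/GMT+12 date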



Many thanks to Giovanni for coming up with an API he was happy with after I attempted a couple of iterations on one. Enjoy!

Update: As expected, a bug crept in. Thanks to Colin Guthrie for spotting the error in the "Anywhere on Earth" timezone. See this section for the fun we have to deal with.
November 28, 2017


So let's start off by covering how ChromiumOS relates to ChromeOS. The ChromiumOS project is essentially ChromeOS minus branding and some packages for things like the media digital restrictions management.

But on the whole, almost everything is there, and the pieces that aren't, you don't need.

ChromiumOS

Depot tools

In order to check out ChromiumOS and other large Google projects, you'll need depot tools.

git clone https://chromium.googlesource.com/chromium/tools/depot_tools.git
export PATH=$PATH:$(pwd)/depot_tools

Maybe you'd want to add the PATH export to your .bashrc.
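
For example, assuming you cloned depot_tools into ~/chromium (adjust the path to wherever you actually put it):

# ~/chromium is an assumed location; use your own checkout path
echo 'export PATH="$PATH:$HOME/chromium/depot_tools"' >> ~/.bashrc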

Building ChromiumOS

mkdir chromiumos
cd chromiumos
repo init -u https://chromium.googlesource.com/chromiumos/manifest.git --repo-url https://chromium.googlesource.com/external/repo.git [-g minilayout]
repo sync -j75
cros_sdk
export BOARD=amd64-generic
./setup_board --board …
November 20, 2017

Another series of VC5 GL features this week:

  • Mostly implemented Z32F_S8 texture/FBO support (building new helpers for other drivers that have to do this).
  • Implemented helpers in core gallium for mappings of MSAA buffers.
  • Landed fix for stencil reference values.
  • Landed fix for colormasks for BGRA (not RGBA) FBOs.
  • Landed fix for clear color for BGRA FBOs.
  • Landed fix for 16/32-bit integer texturing.

For VC4, I reviewed and landed a bugfix that would cause kernel oopses in the IRQ handler path for the out-of-memory signal. I think this covered the only known oops in VC4’s 3D.

I also spent a while on the VC4 backport, debugging a regression related to the DSI changes: now, if the DSI panel is present in the device tree overlay but physically disconnected, the VC4 driver won't load. Unfortunately, there aren't really good solutions for this, because in the ARM DT world the assumption is that your hardware is fixed and you can't just optionally plug hardware in without doing a bunch of manual editing of your DT. I'm working with the DRM bridge maintainers to come up with a plan.

November 14, 2017

Another series of VC5 GL features this week:

  • Occlusion query support.
  • GL_RASTERIZER_DISCARD support.
  • Transform feedback’s queries mostly supported.
  • Fixed GL_OUT_OF_MEMORY (and OOMing the python runner!) on piglit’s streaming-texture-leak.
  • Fixed 8/16-bit integer texturing.

For VC4, the big news is that we’ve landed Boris’s MADVISE support in Mesa as well now. This means that if you have a 4.15 kernel and the next release of Mesa, the kernel will now be able to clean up the userspace BO cache when we run out of CMA. This doesn’t prevent all GL_OUT_OF_MEMORY errors, but it should reduce the circumstances where you can hit them.

I spent a while putting together a backport of all our kernel development for Raspbian's rpi-4.9.y branch. So much has happened in DRM in the last year that it's getting harder and harder to backport our work. However, the PR I sent brings in fully functional support for the DSI panel (no more purple flickering!) and a fix for a longstanding race that could crash the kernel when powering down the GPU (thanks to Stefan Schake for debugging and providing a patch!)

I also fixed the VC4 build and armhf cross-builds with the new meson build system, after Timothy Arceri noted that it wasn’t working on his Pi. I’m now happily using meson for all of my Mesa development.

November 07, 2017
It appears that Ubuntu's mesa 17.2.2 packages that ship radv have patches to enable Mir support. These patches actually just break radv instead. I'd seen some people complain that simple apps don't work on radv, saying radv wasn't ready for use and asking how anyone could think of using it, and I just wondered what they had been smoking, as Fedora was working fine. Hopefully Canonical can sort that out ASAP.
November 06, 2017
Last week I played a bit with crosvm, a KVM monitor used within Chromium OS for application isolation. My goal is to learn more about the current limits of virtualization for isolating applications in mainline. Two of crosvm's defining characteristics are that it's written in Rust for increased security, and that it uses namespaces extensively to reduce the attack surface of the monitor itself.

It was quite easy to get it running outside Chromium OS (I have been testing with Fedora 26), with the only complication being that minijail isn't widely packaged in distros. In the instructions below we hack around the issue with linker environment variables so we don't have to install it properly. Instructions are in the form of shell commands, for illustrative purposes only.

Build kernel:
$ cd ~/src
$ git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
$ cd linux
$ git checkout v4.12
$ make x86_64_defconfig
$ make bzImage
$ cd ..
Build minijail:
$ git clone https://android.googlesource.com/platform/external/minijail
$ cd minijail
$ make
$ cd ..
Build crosvm:
$ git clone https://chromium.googlesource.com/a/chromiumos/platform/crosvm
$ cd crosvm
$ LIBRARY_PATH=~/src/minijail cargo build
Generate rootfs:
$ cd ~/src/crosvm
$ dd if=/dev/zero of=rootfs.ext4 bs=1K count=1M
$ mkfs.ext4 rootfs.ext4
$ mkdir rootfs/
$ sudo mount rootfs.ext4 rootfs/
$ debootstrap testing rootfs/
$ sudo umount rootfs/
Run crosvm:
$ LD_LIBRARY_PATH=~/src/minijail ./target/debug/crosvm run -r rootfs.ext4 --seccomp-policy-dir=./seccomp/x86_64/ ~/src/linux/arch/x86/boot/compressed/vmlinux.bin
The work ahead includes figuring out the best way for Wayland clients in the guest to interact with the compositor in the host, and also for guests to make efficient use of the GPU.

I spent the last week mostly working on VC5 GL features and bugfixes again.

  • Fixed a crash with ARB fragment programs.
  • Fixed sRGBA8 ETC2 support.
  • Fixed piglit early-z on hardware.
  • Fixed lod clamping in the presence of BASE_LEVEL.
  • Added support for anisotropic filtering.
  • Fixed mipmap filtering setup in the HW.
  • Fixed padding of small miplevels of UIF textures.
  • Fixed GLSL ES 3.0 minimum-maximum values.
  • Fixed GL 2.1 min-max values.
  • Fixed stencil state (and moved it to CSO to reduce draw overhead).
  • Reduced emission of unused GL state.
  • Improved CL debug output.
  • Fixed alignment of texture sampler state.
  • Fixed depth (without stencil) render targets.

However, most of my time was actually spent on trying to track down my remaining GPU hangs. Not many tests hang, but fbo-generatemipmaps is one, and it’s a big component of piglit’s coverage. So far, I’ve definitely figured out that the hanging RCL has run to completion without error, but without the bit in the interrupt status being set (which is supposed to be set by the final command of the RCL). I tracked down that I had my DT wrong and was treating VC5 as edge-triggered rather than level, but that doesn’t seem to have helped.

On the VC4 front, I’ve been talking to someone trying to rig up other DSI panels to the RPi. It got me looking at my implementation again, and I finally found why my DSI transactions weren’t working: I was emitting the wrong type of transactions for the bridge! Switching from a DCS write to a generic write, the panel comes up fine and the flickering is gone.

October 30, 2017

I spent the last week entirely working on VC5 GL features and bugfixes.

  • Enabled MSAA rendering and the general case of resolves
  • Fixed non-2D texturing
  • Fixed GPU hang with no vertex elements used
  • Fixed many subtests of fbo-clear-formats
  • Fixed many subtests of fbo-blending-formats
  • Fixed glGetTexImage() after FBO rendering to RGB10_A2 textures
  • Fixed enabling of stencil tests
  • Fixed fneg(0.0) not producing -0.0
  • Fixed unpack_*_4x8 shader opcodes
  • Fixed filtering of GL_CLAMP (desktop GL compatibility) texture wrap mode
  • Fixed gl_FragCoord pixel center behavior
October 27, 2017

Why doesn't this work automatically?

The firmware blob that is needed by Broadcom devices is not supplied by default, and it has to be supplied manually.

How To

Download BCM-0a5c-6410.hcd and copy it into /lib/firmware/brcm/ and then reboot your device.

wget https://memcpy.io/files/2017-10-28/BCM-0a5c-6410.hcd
sudo cp BCM-0a5c-6410.hcd /lib/firmware/brcm/
sudo chmod 0644 /lib/firmware/brcm/BCM-0a5c-6410.hcd
sudo reboot
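
After the reboot you can check that the firmware was actually picked up; a rough way to do that is to grep the kernel log (the exact messages vary by kernel and BlueZ version, so treat this as a sketch):

dmesg | grep -iE 'bluetooth|brcm'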
October 26, 2017

libwacom has been around since 2011 now but I'm still getting the odd question or surprise at what libwacom does, is, or should be. So here's a short summary:

libwacom only provides descriptions

libwacom is a library that provides tablet descriptions but no actual tablet event handling functionality. Simply said, it's a library that provides access to a bunch of text files. Graphics tablets are complex and to integrate them well we usually need to know more about them than the information the kernel reports. If you need to know whether the tablet is a standalone one (Wacom Intuos series) or a built-in one (Wacom Cintiq series), libwacom will tell you that. You need to know how many LEDs and mode groups a tablet has? libwacom will tell you that. You need an SVG to draw a representation of the tablet's button layout? libwacom will give you that. You need to know which stylus is compatible with your tablet? libwacom knows about that too.

But that's all it does. You cannot feed events to libwacom, and it will not initialise the device for you. It just provides static device descriptions.
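
If you want to see what those descriptions contain for the tablets currently plugged into your machine, libwacom ships a small query tool you can just run; the output format may vary between versions:

libwacom-list-local-devices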

libwacom does not make your tablet work

If your tablet isn't working or the buttons aren't handled correctly, or the stylus is moving the wrong way, libwacom won't be able to help with that. As said above, it merely provides extra information about the device but is otherwise completely ignorant of the actual tablet.

libwacom handles any tablet

Sure, it's named after Wacom tablets because that's where the majority of effort goes (not least because Wacom employs Linux developers!). But the description format is independent of the brand so you can add non-Wacom tablets to it too.

Caveat: many of the cheap non-Wacom tablets re-use USB ids so two completely different devices would have the same USB ID, making a static device description useless.

Who uses libwacom?

Right now, the two most prevalent users of libwacom are GNOME and libinput. GNOME's control center and mutter use libwacom for tablet-to-screen mappings as well as to show the various stylus capabilities. And it uses the SVG to draw an overlay for pad buttons. libinput uses it to associate the LEDs on the pad with the right buttons and to initialise the stylus tools axes correctly. The kernel always exposes all possible axes on the event node but not all styli have all axes. With libwacom, we can initialise the stylus tool based on the correct information.

Resources

So now I expect you to say something like "Oh wow, I'm like totally excited about libwacom now and I want to know more and get involved!". Well, fear not, there is more information and links to the repos in the wiki.

October 24, 2017


A recording of the talk can be found here.

Downloads

If you're curious about the slides, you can download the PDF or the OTP.

Thanks

This post has been a part of work undertaken by my employer Collabora.

I would like to thank the wonderful organizers of Embedded Linux Conference EU, for hosting a great community event.

October 23, 2017

The GNOME.Asia Summit 2017 organizers invited me to speak at their conference in Chongqing/China, and it was an excellent event! Here's my brief report:

Because we arrived one day early in Chongqing, my GNOME friends Sri, Matthias, Jonathan, David and I started our journey with an excursion to the Dazu Rock Carvings, a short bus trip from Chongqing, and an excellent (and sometimes quite surprising) sight. I mean, where else can you see a centuries-old buddha with 1000+ hands holding a Nexus 5 cell phone? Here's proof:

The GNOME.Asia schedule was excellent, with various good talks, including some about Flatpak, Endless OS, rpm-ostree, Blockchains and more. My own talk was about The Path to a Fully Protected GNOME Desktop OS Image (Slides available here). In the hallway track I did my best to advocate casync to whoever was willing to listen, and I think enough were ;-). As we all know attending conferences is at least as much about the hallway track as about the talks, and GNOME.Asia was a fantastic way to meet the Chinese GNOME and Open Source communities.

The day after the conference the organizers of GNOME.Asia organized a Chongqing day trip. A particular highlight was the ubiquitous hot pot, sometimes with the local speciality: fresh pig brain.

Here some random photos from the trip: sights, food, social event and more.

I'd like to thank the GNOME Foundation for funding my trip to GNOME.Asia. And that's all for now. But let me close with an old chinese wisdom:

   The Trials Of A Long Journey Always Feeling, Civilized Travel Pass Reputation.

For those living under a rock, the videos from everybody's favourite Userspace Linux Conference All Systems Go! 2017 are now available online.

All videos

The videos for my own two talks are available here:

Synchronizing Images with casync (Slides)

Containers without a Container Manager, with systemd (Slides)

Of course, this is the stellar work of the CCC VOC folks, who are hard to beat when it comes to videotaping of community conferences.

Edit: linrunner (TLP author) has been so kind as to make prebuilt Ubuntu kernel packages with the patch available.

My next project for Red Hat is to work on improving Linux laptop battery life. Part of the (hopefully) low hanging fruit here is using kernel tunables to enable more runtime power management. My first target here is SATA Link Power Management (LPM) which, as Matthew Garrett blogged about 2 years ago, can lead to a significant improvement in battery life.

There is only one small problem: there have been some reports that some disks/SSDs don't play well with Linux's min_power LPM policy, and that this may lead to system crashes and even data corruption.

Let me repeat this: enabling SATA LPM may lead to DATA CORRUPTION. So if you want to help with testing this please make sure you have recent backups! Note this happens only in rare cases (likely only with a couple of specific SSD models with buggy firmware). But still, DATA CORRUPTION may happen, so make sure you have BACKUPS.

As part of his efforts 2 years ago Matthew found this document, which describes the LPM policy the Windows Intel Rapid Storage Technology (IRST) drivers use by default; most laptops ship with these drivers installed.

So based on an old patch from Matthew I've written a patch adding support for a new LPM policy called "med_power_with_dipm" to Linux. This saves (almost) as much power as the min_power setting, and since it matches the Windows defaults I hope that it won't trip over any SSD/HDD firmware bugs.

So this is where my call for testers comes in: for Fedora 28 we would like to switch to this new SATA LPM policy by default (on laptops at least), but we need to know that this is safe to do. So we are looking for people to help test this. If you have a laptop with a SATA drive (not NVMe) and would like to help, please make BACKUPS and then continue reading :)

First of all, on a clean Fedora (no powertop --auto-tune, no TLP) do "sudo dnf install powertop", then close all your apps except for 1 terminal, maximize that terminal and run "sudo powertop".

Now wait 5 minutes; on some laptops the power measurement is a moving average, so this is necessary to get a reliable reading. Now look at the power consumption shown (e.g. 7.95W) and watch it for a couple of refreshes, as it sometimes spikes when something wakes up to do some work. Write down the lowest value you see; this is our base value for your laptop's power consumption.

Next install the new kernel and try the new SATA LPM policy. I've done a scratch-build of the Fedora kernel with this patch added, which you can download here. Linrunner (TLP author) has been so kind as to make prebuilt Ubuntu kernel packages with the patch available.

After downloading all the .x86_64.rpm files there into a dir, run the following from that dir:
sudo rpm -ivh kernel*.x86_64.rpm

Next download a rc.local script applying the new settings from here, copy it to /etc/rc.d/rc.local, and make it executable: "sudo chmod +x /etc/rc.d/rc.local".
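
In case you are wondering what that script boils down to: in essence it just writes the new policy into every SATA host's sysfs knob at boot. A minimal sketch (the actual script linked above may differ in its details):

#!/bin/sh
# sketch only: apply the new LPM policy to all SATA hosts at boot
for p in /sys/class/scsi_host/host*/link_power_management_policy; do
    echo med_power_with_dipm > "$p"
done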

Now reboot and do: "cat /sys/class/scsi_host/host0/link_power_management_policy". This should return med_power_with_dipm; if not, something is wrong.

Then close all your apps except for 1 terminal, maximize that terminal and run "sudo powertop" again. Wait 5 minutes as last time, then get a couple of readings and write down the lowest value you see.

After this continue using your laptop as normal, please make sure that you keep running the special kernel with the patch adding the "med_power_with_dipm" policy. If after 2 weeks you've not noticed any bad side effects (or if you do notice bad side effects earlier) send me a mail at hdegoede@redhat.com with:

  • Report of success or bad side effects

  • The idle power consumption before and after the changes

  • The brand and model of your laptop

  • The output of the following commands:

  • cat /proc/cpuinfo | grep "model name"

  • cat /sys/class/scsi_device/*/device/model

I will gather the results in a table which will be part of the to-be-created Fedora 28 Changes page for this.

Did I mention already that although the chance that something will go wrong is small, it is non-zero, and that you should create backups?

Thank you for your time.

I spent this week mostly on the 7268, getting the GL driver stabilized on the actual HW.

First, I implemented basic overflow memory allocation, so we wouldn’t just hang and reset when that condition triggers. This let me complete an entire piglit run on the HW, which is a big milestone.

I also ended up debugging why the GPU reset on overflow wasn’t working before – when we reset we would put the current bin job onto the tail of the binner job list, so we’d just try it again later and hang again if that was the bad job. The intent of the original code had been to move it to the “done” list so it would get cleaned up without executing any more. However, that was also a problem – you’d end up behind by one in your seqnos completed, so BO idle would never work. Instead, I now just move it to the next stage of the execution pipeline with a “hung” flag to say “don’t actually execute anything from this”. This is a bug in vc4 as well, and I need to backport the fix.

Once I had reliable piglit, I found that there was an alignment requirement for default vertex attributes that I hadn’t uncovered in the simulator. By moving them to CSO time, I reduced the draw overhead in the driver and implicitly got the buffer aligned like we needed.

Additionally, I had implemented discards using conditional tile buffer writes, same as vc4. This worked fine on the simulator, but had no effect on the HW. Replacing that with SETMSF usage made the discards work, and probably reduced instruction count.

On the vc4 front, I merged the MADVISE code from Boris just in time for the last pull request for 4.15. I also got in the DSI transactions sleeping fix, so the code should now be ready for people to try hooking up random DSI panels to vc4.

October 19, 2017

So I have over the last few years blogged regularly about upcoming features in Fedora Workstation. Well, as we are putting the finishing touches on Fedora Workstation 27, I thought I should try to look back at everything we have achieved since Fedora Workstation was launched with Fedora 21. The efforts I highlight here are efforts where we have done significant or most of the development. There are of course a lot of other big changes that have happened over the last few years in the wider community that we leveraged and offer in Fedora Workstation; examples here include things like Meson and Rust. This post is not about those, but that said I do want to write a post just talking about the achievements of the wider community at some point, because they are very important and crucial too. And along the same line this post will not be speaking about the large number of improvements and bugfixes that we contributed to a long list of projects, like to GNOME itself. This blog is about taking stock and taking some pride in what we achieved so far and the major hurdles we passed on our way to improving the Linux desktop experience.
This blog is also slightly different from my normal format as I will not call out individual developers by name as I usually do; instead I will focus on the totality and thus just say 'we'.

  • Wayland – We have been the biggest contributor since we joined the effort and have taken the lead on putting in place all the pieces needed for actually using it on a desktop, including starting to ship it as our primary offering in Fedora Workstation 25. This includes putting a lot of effort into ensuring that XWayland works smoothly to ensure full legacy application support.
  • Libinput – A new library we created for handling all input under both X and Wayland. This came about due to needing input handling that was not tied to X due to Wayland, but it has even improved input handling for X itself. Libinput is being rapidly developed and improved, with 1.9 coming out just a few days ago.
  • glvnd – Dealing with multiple OpenGL implementations has been a pain under Linux for years. We worked with NVidia on this effort to ensure that you can install multiple OpenGL implementations on the system and have your system be able to use the correct one depending on which GPU and driver you are using. We keep expanding on this solution to cover more usecases, so for Fedora Workstation 27 we expect to bring glvnd support to XWayland for instance.
  • Porting Firefox to GTK3 – We ported Firefox to GTK3, including making sure it works under Wayland. This work also provided the foundation for HiDPI support in Firefox. We are the single biggest contributor to Firefox Linux support.
  • Porting LibreOffice to GTK3 – We ported LibreOffice to GTK3, which included Wayland support, touch support and HiDPI support. Our team is one of the major contributors to LibreOffice and helps move the project forward on a lot of fronts.
  • Google Drive integration – We extended the general Google integration in GNOME 3 to include support for Google Drive as we found that a lot of our users were relying on Google Apps at their work.
  • Flatpak – We created Flatpak to lead the way in moving desktop applications into their own namespaces and containers, resolving a lot of long term challenges for desktop applications on Linux. We expect to have new infrastructure in place in Fedora soon to allow Fedora packagers to quickly and easily turn their applications into Flatpaks.
  • Linux Firmware Service – We created the Linux Firmware Service to provide a way for Linux users to get easy access to UEFI firmware updates on their Linux system and worked with great vendors such as Dell and Logitech to get them to support it for their devices. Many bugs experienced by Linux users over the years could have been resolved by firmware updates, but with tooling being spotty many Linux users were not even aware that there were fixes available.
  • GNOME Software – We created GNOME Software to give us a proper Software Store on Fedora and extended it over time to include features such as fonts, GStreamer plugins, GNOME Shell extensions and UEFI firmware updates. Today it is the main Store type application used not just by us, but our work has been adopted by other major distributions too.
  • mp3, ac3 and aac support – We have spent a lot of time to be able to bring support for some of the major audio codecs to Fedora, like MP3, AC3 and AAC. In the age of streaming, supporting codecs is maybe of less importance than it used to be, but there is still a lot of media on people's computers that they need and want access to.
  • Fedora Media Creator – Cross platform media creator making it very easy to create Fedora Workstation install media regardless of whether you are on Windows, Mac or Linux. As we move away from optical media, offering ISO downloads started feeling more and more outdated; with the media creator we have given a uniform user experience to quickly create your USB install media, especially important for new users coming in from Windows and Mac environments.
  • Captive portal – We added support for captive portals in Network Manager and GNOME 3, ensuring easy access to the internet over public wifi networks. This feature has been with us for a few years now, but it is still a much appreciated addition.
  • HiDPI support – We worked to add support for HiDPI across X, Wayland, GTK3 and GNOME3. We lead the way on HiDPI support under Linux and keep working on various applications to this date to polish up the support.
  • Touch support – We worked to add support for touchscreens across X, Wayland, GTK3 and GNOME3. We spent significant resources enabling this, both on laptop touchscreens, but also to support modern wacom devices.
  • QGNOME Platform – We created the QGNOME Platform to ensure that Qt applications work well under GNOME3 and give a nice native and integrated feel. So while we ship GNOME as our desktop offering we want Qt applications to work well and feel native. This is an ongoing effort, but for many important applications it already is a great improvement.
  • Nautilus improvements – Nautilus had been undermaintained for quite a while, so we had Carlos Soriano spend significant time on reworking major parts of it and adding new features like renaming multiple files at once, updating the views and in general bringing it up to date.
  • Night light support in GNOME – We added support for automatically adjusting the color and light settings on your system based on light sensors found in modern laptops. This integrates functionality that you previously had to install extra software like Redshift to enable.
  • libratbag – We created a library that enables easy configuration of high end mice and other kinds of input devices. This has led to increased collaboration with a lot of gaming mice manufacturers to ensure full support for their devices under Linux.
  • RADV – We created a full open source Vulkan implementation for AMD GPUs which recently got certified as Vulkan compliant. We wanted to give open source Vulkan a boost, so we created the RADV project, which now has an active community around it and is being tested with major games.
  • GNOME Shell performance improvements – We have been working on various performance improvements to GNOME Shell over the last few years, with significant improvements having happened. We want to push the envelope on this further though and are planning a major performance hackfest around Shell performance and resource usage early next year.
  • GNOME terminal developer improvements – We worked to improve the features of GNOME Terminal to make it an even better tool for developers with items such as easier naming of terminals and notifications for long running jobs.
  • GNOME Builder – Improving the developer story is crucial for us and we have been doing a lot of work to make GNOME Builder a great tool for developers to use, both to improve the desktop itself and for development in general.
  • Pipewire – We created a new media server to unify audio, pro-audio and video. The first version, which we are shipping in Fedora 27, handles our video capture.
  • Fleet Commander – We launched Fleet Commander, our new tool for managing large Linux desktop deployments. This answers a long-standing call from many of Red Hat's major desktop customers, and from many admins of large scale Linux deployments at universities and similar, for a powerful yet easy to use administration tool for large desktop deployments.

I am sure I missed something, but this is at least a decent list of Fedora Workstation highlights for the last few years. Next onto working on my Fedora Workstation 27 blogpost :)

October 18, 2017

Alberto Ruiz just announced Fleet Commander as production ready! Fleet Commander is our new tool for managing large deployments of Fedora Workstation and RHEL desktop systems. So head over to Alberto's Fleet Commander blog post for all the details.

Something went incredibly right, and review feedback poured in last week and I got to merge a lot of code.

My VC5 GL driver’s patches for core Mesa got reviewed (thanks Rob Clark, Adam Jackson, and Emil Velikov), so I got to merge it to Mesa. It’s so nice to finally be able to work in tree instead of on a rebasing branch that breaks most weeks.

My GL_OES_required_internalformat got reviewed by Nicolai Hähnle, so I gave it another test run on the Intel CI farm (thanks, Mark Janes!) and merged. VC4 and VC5 now have proper 5551 texture format support, and VC4 conformance test failures with 565 are fixed.

My GL_MESA_tile_raster_order extension for overlapping blit support on VC4 got merged to Khronos’s git tree. Nicolai reviewed my Mesa implementation of the extension, so I’ve merged it. All that’s left for that is merging the X Server usage of it and pushing it on downstream to Raspbian.

I tested the fast mutex patch series for Mesa, and found a 4.3% (+/- .9%) improvement in 10x10 copywinwin on my Intel hardware. Hopefully this lands soon, since those performance improvements should show up on ARM as well.

On the VC5 front, I fixed VPM setup on actual HW (the simulator's restrictions didn't catch one of the HW requirements), getting a lot of tests that do gl_ModelViewProjectionMatrix * gl_Vertex to work. I played around with the new GPU reset code a bit, and it looks like the next step is to implement binner overflow handling.

I’ve been doing some more review feedback with Boris. We’re getting closer to merge on MADVISE, for sure. I respun my DSI transactions fix based on Boris’s feedback, and it’s even nicer now.

Next week: VC5 binner overflow handling, merging MADVISE, and hopefully putting together some Raspbian backports.

October 12, 2017

So I am really happy to announce another major codec addition to Fedora Workstation 27, namely the codec called AAC. As you might have seen from Tom Callaway's announcement, this has just been cleared for inclusion in Fedora.

For those not well versed in the arcane lore of audio codecs AAC is the codec used for things like iTunes and is found in a lot of general media files online. AAC stands for Advanced Audio Coding and was created by the MPEG working group as the successor to mp3. Especially due to Apple embracing the format there is a lot of files out there using it and thus we wanted to support it in Fedora too.

What we will be shipping in Fedora is a modified version of the AAC implementation released by Google, which was originally written by Fraunhofer. On top of that we will of course be providing GStreamer plugins to enable full support for playing and creating AAC files for GStreamer applications.
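
Once the plugins are installed, any GStreamer based player should pick the decoder up automatically; for a quick command line test something like this should do (the file path is obviously just an example):

# example file path; playbin automatically selects a suitable AAC decoder
gst-launch-1.0 playbin uri=file://$HOME/Music/example.m4a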

Be aware though that AAC is a bit of an umbrella term for a lot of different technologies, and thus you might come across files that claim to use AAC but which we can not play back. The most likely reason for that would be that they require an AAC profile we do not support. The version of AAC that we will be shipping has also been carefully created to fit within the requirements for software in Fedora, so if you are a packager be aware that, unlike with for instance mp3, this change does not mean you can package and ship any AAC implementation you want to in Fedora.

I am expecting to have more major codec announcements soon, so stay tuned :)

October 10, 2017

I spent this week in front of the VC5 hardware, working toward implementing GPU reset. I’m going to need reliable reset before I can start running GL testsuites.

Of course, to believe that GPU reset works, I’m going to want some tests that trigger it. I pulled the VC5 XML-based code-generation from Mesa into i-g-t, and built up basic rendering tests using it. It caught that I was misusing struct scatterlist (causing CPU page faults trying to fill the VC5 MMU). I also had mistyped a bit of the XML, calling a bitmask a bool so that the hardware tried to store render targets 1, 2, and 3, instead of 0 (causing GPU hangs). After stabilizing all that, building the hang testcase was pretty simple.

Taking a break from kernel adventures, I did a bit more work on the vc5 GL driver. Transform feedback now has many tests passing, provoking vertex is supported, float32 render targets are supported, MRTs are supported, colormasks are fixed for non-independent blending, a regression in blending with a 0 factor is fixed, and >32bpp clear colors are fixed.

I’ve got a new revision of Boris’s VC4 MADVISE work, and it’s looking good. Boris has also cleaned up some debug messages that have been spamming dmesg on the Raspberry Pi, which is great news for us.

I also spent quite some time reviewing Dylan’s meson work for Mesa. It’s a slog – build systems for a project of Mesa’s scale are huge, but I’ve seen what meson can get you in productivity gains from my conversion of the X Server, and it looks like we should be able to replace all 3(!) of Mesa’s build systems with this one.

Finally, on Friday I got the reviews necessary for the DSI panel driver, and I merged it to drm-misc-next to appear in the 4.15 kernel. We still need to figure out what to do about the devicetree for it (since it’s sort of an optional piece of hardware for the board, but more official than other hardware), but at least this is a lot less for downstreams to carry.

Next week: merging vc5 GL and working on actually performing GPU reset.