  • Is it possible to run X as root but the wm as normal user?

This would be just a workaround for the current amdgpu situation on NetBSD 10, where X seems to work with amdgpu only when logged in as root. I'm not sure whether running X as root is as risky as doing anything else as root, but maybe not. The only thing I've tried so far was adding "su -l r0ller" to xinitrc just before calling "exec jwm", but it just hung the process. Does anyone have other ideas or opinions?

    This is already the default behaviour! The X server runs as root even when you run startx as an unprivileged user.


      bsduck no, not on NetBSD. The X11 process is owned by the user.

        pin How so? I did check on a NetBSD machine before replying, and I do have a /usr/X11R7/bin/X process running as root.


          bsduck yes, you do

          ~> ps aux
          USER     PID %CPU %MEM      VSZ    RSS TTY   STAT STARTED    TIME COMMAND
          [...]
          root     919  0.0  0.4   122504  40404 ?     Sl   11:09PM 0:02.79 /usr/X11R7/bin/X :0 -noretro -auth /home/pin/.serverauth.749 
          [...]

          or no, you don't

          What's it gonna be?

          Try killing it and you will see you don't need "super-powers" to do so.


            pin You're both right: /usr/X11R7/bin/Xorg is setuid on NetBSD; it drops privileges once setup is done.

            r0ller This would be just a workaround for the current amdgpu situation on NetBSD 10, where X seems to work with amdgpu only when logged in as root.

            This is possible, but root has to explicitly allow other users to connect to its X server (disallowed by default for security reasons). Try xhost +local: before you switch to the WM (and make sure $DISPLAY is set, etc.).

            But I don't see why X should work only for root. Can you show the output of the following (as root):

            fstat -p $(pgrep X)
            ls -l /dev/dri

            Let's see if the devices and their permissions are OK.

              rvp
              ps aux shows:
              root 1770 0,0 0,1 81612 10244 ? Ss 12:25du. 0:00.06 /usr/X11R7/bin/xterm -class UXTerm
              root 1969 0,0 0,4 281672 69976 ? S<l 12:25du. 0:01.24 /usr/X11R7/bin/X :0 -noretro -auth /root/.serverauth.1580

              fstat -p $(pgrep X) says:
              fstat: stat(1770): No such file or directory
              fstat: stat(1969): No such file or directory

              However, fstat -p 1969 shows:
              USER CMD PID FD MOUNT INUM MODE SZ|DV R/W
              root X 1969 wd / 24258560 drwxr-xr-x 1536 r
              root X 1969 0 / 4380680 crw------- constty rw
              root X 1969 1 / 4380680 crw------- constty rw
              root X 1969 2 / 4380680 crw------- constty rw
              root X 1969 3 / 35563701 -rw-r--r-- 25817 w
              root X 1969 4* unix stream <-> /tmp/.X11-unix/X0 [creat]
              root X 1969 5* unix stream <-> /tmp/.X11-unix/X0 [creat]
              root X 1969 6 / 4382449 crw-r----- pci0 rw
              root X 1969 7 / 4382450 crw-r----- pci1 rw
              root X 1969 8 / 4382451 crw-r----- pci2 rw
              root X 1969 9 / 4382452 crw-r----- pci3 rw
              root X 1969 10 / 4382453 crw-r----- pci4 rw
              root X 1969 11 / 4382454 crw-r----- pci5 rw
              root X 1969 12 / 4382455 crw-r----- pci6 rw
              root X 1969 13 / 4382456 crw-r----- pci7 rw
              root X 1969 14 / 4382457 crw-r----- pci8 rw
              root X 1969 15 / 4382458 crw-r----- pci9 rw
              root X 1969 16 / 4382459 crw-r----- pci10 rw
              root X 1969 17 / 4382460 crw-r----- pci11 rw
              root X 1969 18 / 4383208 crw-r----- pci12 rw
              root X 1969 19 / 4383209 crw-r----- pci13 rw
              root X 1969 20 / 4380683 crw-r----- mem rw
              root X 1969 21 / 4380696 crw------- ttyE1 rw
              root X 1969 22* misc ops=0xffffffff815d6f40 0xfffffcc5f8de0e48
              root X 1969 23* misc ops=0xffffffff815d6f40 0xfffffcc5f8de0e48
              root X 1969 24 / 4380695 crw------- ttyE0 rw
              root X 1969 25* misc ops=0xffffffff815d6f40 0xfffffcc5f8de0e48
              root X 1969 26* misc ops=0xffffffff815d6f40 0xfffffcc5f8de0e48
              root X 1969 27* misc ops=0xffffffff815d6f40 0xfffffcc5f8de0e48
              root X 1969 28* misc ops=0xffffffff815d6f40 0xfffffcc5f8de0e48
              root X 1969 29* pipe 0xfffffcc5f7b04700 <- 0xfffffcc5f7a65038 rn
              root X 1969 30* pipe 0xfffffcc5f7a65038 -> 0xfffffcc5f7b04700 w
              root X 1969 31* pipe 0xfffffcc5f7b04320 <- 0xfffffcc5f7a65418 rn
              root X 1969 32* pipe 0xfffffcc5f7a65418 -> 0xfffffcc5f7b04320 w
              root X 1969 33 / 4380743 crw------- wsmux0 rw
              root X 1969 34* unix stream <-> /tmp/.X11-unix/X0 [creat]
              root X 1969 35* unix stream <-> /tmp/.X11-unix/X0 [creat]
              root X 1969 36* unix stream <-> /tmp/.X11-unix/X0 [creat]
              root X 1969 37* unix stream <-> /tmp/.X11-unix/X0 [creat]
              root X 1969 38* unix stream <-> /tmp/.X11-unix/X0 [creat]
              root X 1969 39* unix stream <-> /tmp/.X11-unix/X0 [creat]
              root X 1969 40* unix stream <-> /tmp/.X11-unix/X0 [creat]
              root X 1969 41* unix stream <-> /tmp/.X11-unix/X0 [creat]
              root X 1969 42* unix stream <-> /tmp/.X11-unix/X0 [creat]
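The two stat() errors above suggest that pgrep X printed more than one PID and that fstat treated the extras as file names, since fstat takes a single -p argument followed by optional files. A minimal sketch of a workaround, assuming pgrep -x (exact process-name match) is available; for_each_pid is a hypothetical helper, not something from the thread:

```sh
# for_each_pid CMD [ARGS...]: read whitespace-separated PIDs from stdin and
# run "CMD ARGS... PID" once per PID, so a tool that accepts only one PID
# never sees the remaining ones as file names.
for_each_pid() {
    for pid in $(cat); do
        "$@" "$pid"
    done
}

# Intended use on NetBSD (commented out so the sketch has no side effects):
#   pgrep -x X | for_each_pid fstat -p
```

Passing the PID explicitly, as r0ller ended up doing with fstat -p 1969, gives the same result for a single server process.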

              ls -l /dev/dri output is:
              crw-r----- 1 root wheel 180, 0 Jun 6 12:25 card0
              crw-rw---- 1 root wheel 180, 128 Apr 5 23:22 renderD128
              crw-rw---- 1 root wheel 180, 129 Apr 5 23:22 renderD129
              crw-rw---- 1 root wheel 180, 130 Apr 5 23:22 renderD130
              crw-rw---- 1 root wheel 180, 131 Apr 5 23:22 renderD131

              Unfortunately, I don't know whether any of those are correct. Actually, I'm starting to think the problem is not that X can only run as root, since I only started experimenting with root after X failed to start, after I had used the system with amdgpu for a week or so. But now the same happens when logging in as root: after startx I get some garbage on the screen and the system hangs, no matter how many times I restart it. What seems to help is to boot with the original NetBSD 10 kernel, which disables amdgpu by default, run startx, then reboot with the same kernel but with amdgpu enabled; then startx works again. No clue what this magic is. Currently, I'm logged in as root and started X as root. I guess it'd work now as a normal user too.

              Concerning xhost, adding it to .xinitrc made the process hang so I added it to the root's .bash_profile like:
              if xhost >& /dev/null ; then xhost + ; su -l r0ller; fi;

              and added to r0ller's .bash_profile:
              if [ -z "${DISPLAY}" ];then export DISPLAY=:0;fi;

              This way, a uxterm opens as if r0ller had logged in, and it can start any X app.

              The whole system seems to behave pretty strangely though:
              1) If I boot normally, amdgpu cannot kick in and hangs the boot process. But if I boot into single-user mode first, reboot, and then boot normally, it gets to the login prompt (not xdm).
              2) The other strange thing I noticed is that ACPI seems to work correctly only when I boot into single-user mode. At least, the poweroff command works only in that case. When booting normally, even if logged in as root, poweroff does not turn off the box; it only halts and prints something like "press any key to reboot".


                kc9udx
                Hm, mine looks ok:
                -rws--x--x 1 root wheel 2902344 Mar 28 09:33 /usr/X11R7/bin/Xorg

                r0ller ls -l /dev/dri output is:

                crw-r-----  1 root  wheel  180,   0 Jun  6 12:25 card0

                Well, that explains some things at least: a) wheel should have rw permissions for all the devices in /dev/dri, and b) the other card devices are missing (only matters if you have multiple graphics cards):

                $ ls -l /dev/dri/
                total 0
                crw-rw----  1 root  wheel  180,   0 Feb 15 03:18 card0
                crw-rw----  1 root  wheel  180,   1 Feb 15 03:18 card1
                crw-rw----  1 root  wheel  180,   2 Feb 15 03:18 card2
                crw-rw----  1 root  wheel  180,   3 Feb 15 03:18 card3
                crw-rw----  1 root  wheel  180, 128 Jan 31 04:22 renderD128
                crw-rw----  1 root  wheel  180, 129 Jan 31 04:22 renderD129
                crw-rw----  1 root  wheel  180, 130 Jan 31 04:22 renderD130
                crw-rw----  1 root  wheel  180, 131 Jan 31 04:22 renderD131
                $ 

                This should fix it:

                # cd /dev/ && rm -fv dri/*
                # sh ./MAKEDEV drm0 drm1 drm2 drm3
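To confirm the result, one could filter the ls -l output for device nodes whose group bits are not rw. This is only a sketch; bad_group_perms is a hypothetical helper name, and it assumes the usual ls -l layout with the mode string in the first column and the node name in the last:

```sh
# bad_group_perms: read "ls -l" lines on stdin and print the names of
# character devices whose group permission bits are not "rw".
bad_group_perms() {
    awk '/^c/ && substr($1, 5, 2) != "rw" { print $NF }'
}

# Intended use (commented out; /dev/dri may not exist on every system):
#   ls -l /dev/dri | bad_group_perms
```

Fed the listing r0ller posted, this prints only card0; after regenerating the nodes it should print nothing.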

                r0ller Concerning xhost, adding it to .xinitrc made the process hang so I added it to the root's .bash_profile like:
                if xhost >& /dev/null ; then xhost + ; su -l r0ller; fi;

                Hmm. Do you use startx or xdm? Adding it to .bash_profile is useless if you run startx, because there's no X server running at that time. Once the server is running, you can run xhost at any time (and xhost + is bad: it allows everybody access to the X server, which is especially bad if the server accepts TCP/IP connections).
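A narrower alternative to xhost + is to grant access to a single local account. The sketch below uses xhost's server-interpreted localuser form; that this works with the NetBSD X server is an assumption, and allow_local_user and the XHOST override are hypothetical names used only for illustration:

```sh
# XHOST can be overridden (for example with "echo") for a dry run.
XHOST=${XHOST:-xhost}

# allow_local_user NAME: let one local account connect to the running
# X server instead of opening it to everybody.
allow_local_user() {
    "$XHOST" "+si:localuser:$1"
}

# Intended use from a shell inside the running X session (commented out):
#   allow_local_user r0ller
```

Like any xhost call, this only makes sense once the server is up, so it belongs in .xinitrc or an interactive shell rather than in .bash_profile.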

                r0ller 1) If I boot normally, amdgpu cannot kick in and hangs the boot process. But if I boot into single-user mode first, reboot, and then boot normally, it gets to the login prompt (not xdm).

                DRMKMS bug? Missing firmware blobs? Bad RAM? Buggy BIOS? No idea.

                Bad RAM is easiest to check: Memtest86+ or MemTest86

                r0ller 2) The other strange thing I noticed is that ACPI seems to work correctly only when I boot into single-user mode. At least, the poweroff command works only in that case. When booting normally, even if logged in as root, poweroff does not turn off the box; it only halts and prints something like "press any key to reboot".

                Again, no clue. Is it the same if you userconf disable amdgpu* or run GENERIC? (Many kinds of ACPI issues can be fixed with the latest firmware. Can you check yours?)

                My advice is to file a PR for the DRMKMS bugs, and just use wsfb until it gets fixed.

                1. Can you post the dmesg output and your Xorg.0.log?
                2. Are you running the modesetting Xorg driver with DRMKMS or the default radeon?

                  r0ller X seems to work with amdgpu only when logged in as root

                  I tried this just to see if it'd also work for me but it does not.

                  rvp My advice is to file a PR for the DRMKMS bugs, and just use wsfb until it gets fixed.

                  I think this is what I'll need to do. It's far beyond me to file a PR for something like this. I'm here to help test things if needed; just throw commands at me.

                  Cheers

                  rvp
                  Sorry, this will be long as I tried to upload /var/log/messages (as it reveals more than dmesg) and Xorg.0.log but I always just get an error so I'll try to copy them at the end of my write-up. (Edit: created gdrive links for them: messages, Xorg.0.log)

                  The messages file contains 4 boot processes:
                  1st: single user mode boot with amdgpu, then reboot
                  2nd: normal user mode boot with amdgpu, login as root, X crashes
                  3rd: single user mode boot with amdgpu, then reboot
                  4th: normal user mode boot with amdgpu, login as root, X running

                  What I noticed is that amdgpu failing due to mutex handling, and my hack to solve it (see https://www.unitedbsd.com/d/1052-netbsd-9-10-amdgpu/25), may have something to do with the ACPI error I mentioned. Some parts of the dmesg differ between booting in single user mode and normal user mode.

                  BOOT IN SINGLE USER MODE WITH AMDGPU (at the very beginning of boot):

                  Jun 7 09:13:12 r0ller /netbsd: [ 1.0000040] ACPI: RSDP 0x00000000000F05A0 000024 (v02 ALASKA)
                  Jun 7 09:13:12 r0ller /netbsd: [ 1.0000040] ACPI: XSDT 0x00000000DBDAC098 0000B4 (v01 ALASKA A M I 01072009 AMI 00010013)
                  Jun 7 09:13:12 r0ller /netbsd: [ 1.0000040] ACPI: FACP 0x00000000DBDB3C70 000114 (v06 ALASKA A M I 01072009 AMI 00010013)
                  Jun 7 09:13:12 r0ller /netbsd: [ 1.0000040] ACPI: DSDT 0x00000000DBDAC1E8 007A83 (v02 ALASKA A M I 01072009 INTL 20120913)
                  Jun 7 09:13:12 r0ller /netbsd: [ 1.0000040] ACPI: FACS 0x00000000DBE18E00 000040
                  Jun 7 09:13:12 r0ller /netbsd: [ 1.0000040] ACPI: APIC 0x00000000DBDB3D88 00015E (v03 ALASKA A M I 01072009 AMI 00010013)
                  Jun 7 09:13:12 r0ller /netbsd: [ 1.0000040] ACPI: FPDT 0x00000000DBDB3EE8 000044 (v01 ALASKA A M I 01072009 AMI 00010013)
                  Jun 7 09:13:12 r0ller /netbsd: [ 1.0000040] ACPI: FIDT 0x00000000DBDB3F30 00009C (v01 ALASKA A M I 01072009 AMI 00010013)
                  Jun 7 09:13:12 r0ller /netbsd: [ 1.0000040] ACPI: SSDT 0x00000000DBDB3FD0 0000C8 (v02 ALASKA CPUSSDT 01072009 AMI 01072009)
                  Jun 7 09:13:12 r0ller /netbsd: [ 1.0000040] ACPI: SSDT 0x00000000DBDB4098 008C98 (v02 AMD AMD ALIB 00000002 MSFT 04000000)
                  Jun 7 09:13:12 r0ller /netbsd: [ 1.0000040] ACPI: SSDT 0x00000000DBDBCD30 003676 (v01 AMD AMD AOD 00000001 INTL 20120913)
                  Jun 7 09:13:12 r0ller /netbsd: [ 1.0000040] ACPI: MCFG 0x00000000DBDC03A8 00003C (v01 ALASKA A M I 01072009 MSFT 00010013)
                  Jun 7 09:13:12 r0ller /netbsd: [ 1.0000040] ACPI: HPET 0x00000000DBDC03E8 000038 (v01 ALASKA A M I 01072009 AMI 00000005)
                  Jun 7 09:13:12 r0ller /netbsd: [ 1.0000040] ACPI: UEFI 0x00000000DBDC0420 000042 (v01 ALASKA A M I 00000002 01000013)
                  Jun 7 09:13:12 r0ller /netbsd: [ 1.0000040] ACPI: IVRS 0x00000000DBDC0468 0000D0 (v02 AMD AMD IVRS 00000001 AMD 00000000)
                  Jun 7 09:13:12 r0ller /netbsd: [ 1.0000040] ACPI: PCCT 0x00000000DBDC0538 00006E (v01 AMD AMD PCCT 00000001 AMD 00000000)
                  Jun 7 09:13:12 r0ller /netbsd: [ 1.0000040] ACPI: SSDT 0x00000000DBDC05A8 002F29 (v01 AMD AMD CPU 00000001 AMD 00000001)
                  Jun 7 09:13:12 r0ller /netbsd: [ 1.0000040] ACPI: CRAT 0x00000000DBDC34D8 000B58 (v01 AMD AMD CRAT 00000001 AMD 00000001)
                  Jun 7 09:13:12 r0ller /netbsd: [ 1.0000040] ACPI: CDIT 0x00000000DBDC4030 000029 (v01 AMD AMD CDIT 00000001 AMD 00000001)
                  Jun 7 09:13:12 r0ller /netbsd: [ 1.0000040] ACPI: SSDT 0x00000000DBDC4060 001D4A (v01 AMD AmdTable 00000001 INTL 20120913)
                  Jun 7 09:13:12 r0ller /netbsd: [ 1.0000040] ACPI: SSDT 0x00000000DBDC5DB0 0000BF (v01 AMD AMD PT 00001000 INTL 20120913)
                  Jun 7 09:13:12 r0ller /netbsd: [ 1.0000040] ACPI: WSMT 0x00000000DBDC5E70 000028 (v01 ALASKA A M I 01072009 AMI 00010013)
                  Jun 7 09:13:12 r0ller /netbsd: [ 1.0000040] ACPI: 7 ACPI AML tables successfully acquired and loaded

                  BOOT IN NORMAL USER MODE WITH AMDGPU (at the very beginning of boot):

                  Jun 7 09:13:12 r0ller /netbsd: [ 1.0000040] ACPI Error: AE_BAD_PARAMETER, Thread 2175328192 could not acquire Mutex [ACPI_MTX_Tables] (0x2) (20221020/utmutex-326)

                  That ACPI error complains about mutexes and the error that I "fixed" earlier by hacking mutex handling to be able to boot in normal user mode with amdgpu was:
                  panic: unlocking unlocked wait/wound mutex: 0xffff858023c90920

                  Now that panic does not appear on boot, but probably my hack leads to this mysterious behaviour of X sometimes working and sometimes not (regardless of being root or not) with amdgpu. The interesting question is: why does the ACPI error show up only when booting in normal user mode and never when booting in single user mode?

                  I'll now give a try to your suggestion about changing the permissions of /dev/dri but until now I was just busy with booting the system and describing in this post what happened 🙂


                    Concerning /dev/dri, is it just enough to change the permission of card0 or shall I really delete everything (including renderD* files) in /dev/dri and use MAKEDEV? If the latter, I guess it's enough to issue just MAKEDEV drm0 if I have one card. By the way, will it recreate renderD* files as well?


                      r0ller Concerning /dev/dri, is it just enough to change the permission of card0

                      That should be enough.

                      r0ller it's enough to issue just MAKEDEV drm0 if I have one card.

                      Yes.

                      r0ller By the way, will it recreate renderD* files as well?

                      Yes.

                      r0ller The messages file contains 4 boot processes:
                      1st: single user mode boot with amdgpu, then reboot
                      2nd: normal user mode boot with amdgpu, login as root, X crashes
                      3rd: single user mode boot with amdgpu, then reboot
                      4th: normal user mode boot with amdgpu, login as root, X running
                      [...]
                      The interesting question is: why does the ACPI error show up only when booting in normal user mode and never when booting in single user mode?

                      I don't get this either. From your messages I see that in single-user mode, the amdgpu drivers are not active. I don't see how that can happen if the drivers are built-in. Are you loading these as modules for the normal boot and not for the single-user ones?

                      BTW, syslogd is never run in single-user mode, so nothing will ever be written to /var/log/messages in that case. If you happen to reboot from single-user to multi-user, then the previous kernel logs in memory may get preserved and get added to the logs on the subsequent boots.

                      Better to explicitly collect logs while in single-user mode:

                      mount -u -w /
                      dmesg > /root/dmesg.0.txt
                      mount -u -r /
                      poweroff                  # ensure memory is cleared.
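Two such captures, one per boot mode, are easier to compare once the kernel timestamps are stripped. A minimal sketch, assuming each dmesg line starts with a bracketed timestamp like [ 1.0000040]; strip_ts and the second file name are hypothetical:

```sh
# strip_ts: remove the leading "[ seconds.fraction] " kernel timestamp
# from each dmesg line read on stdin; other lines pass through unchanged.
strip_ts() {
    sed -E 's/^\[ *[0-9]+\.[0-9]+\] //'
}

# Intended use (commented out; dmesg.1.txt would be a multi-user capture):
#   strip_ts < /root/dmesg.0.txt > /tmp/su.txt
#   strip_ts < /root/dmesg.1.txt > /tmp/mu.txt
#   diff /tmp/su.txt /tmp/mu.txt | grep ACPI
```

Any ACPI line that shows up on only one side of the diff is a candidate for the single-user versus multi-user difference discussed here.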

                      Can you tell me if the ACPI errors show up without your patch? (I.e., GENERIC with just the amdgpu drivers compiled in.)

                      EDIT: Here's root's ~/.xinitrc, which I used while testing this stuff. With this, if I do startx as root, I get rvp's X session (as if the user typed startx on the command-line).

                      #!/bin/sh
                      
                      xhost +local:
                      exec su -l rvp -c '
                          export DISPLAY=:0
                          export PATH=$HOME/bin:/usr/X11R7/bin:$PATH
                          . ~/.xinitrc
                      '

                        rvp

                        Recompiled the kernel without my hack but with amdgpu still enabled and my findings are as follows:

                        1) The ACPI error is also reported without my hack when booting in normal user mode but not in single user mode.
                        2) Starting X in single user mode as root (after mount -a) crashes with unlocking unlocked mutex (remember, my hack is removed). So the ACPI mutex error reported when booting in normal user mode does not seem to have anything to do with that since in single user mode there are no ACPI errors reported.

                        Concerning xorg.conf, I haven't generated any, I just let the system use the builtin configuration. Shall I generate one as described in https://www.netbsd.org/docs/guide/en/chap-x.html#chap-x-configuration and modify it to use the modesetting driver or amdgpu or anything else?


                          r0ller 1) The ACPI error is also reported without my hack when booting in normal user mode but not in single user mode.

                          Don't see why this should happen. The only difference between single-user and multi-user mode is that in single-user mode, the kernel runs init with the -s flag. Might be a warm-boot/cold-boot difference instead. Is it still the same after a physical power-off between SU/MU?

                          r0ller 2) Starting X in single user mode as root (after mount -a) crashes with unlocking unlocked mutex

                          I would try, in order:

                          1. Updating BIOS (even if there's no newer BIOS, just reflash the current one--that sometimes clears out EFI cruft).

                          2. Do a memtest.

                          3. File a PR. If this unlocking unlocked mutex error happens only when you exit Xorg (as seems to be the case with @pfr), then the cause should pretty easily be found and fixed by people who know this code.

                          r0ller and modify it to use the modesetting driver

                          Yes. Please try this too.

                            rvp
                            Didn't have much time so I only tried modesetting in xorg.conf as normal user but it crashed the system. I also added a dri section which I don't know if it's necessary or not:

                            Section "DRI"
                                Group "wheel"
                                Mode 0660
                            EndSection

                            I'll play around a bit more with xorg.conf. By the way, generating xorg.conf ended with an error message, something like "number of created screens does not match number of detected devices", but the file seemed complete in the end, so I edited and used that. The strange thing is that even though I have one card (R9 Nano), two were detected (card0 and card1).

                              5 days later

                              r0ller Didn't have much time so I only tried modesetting in xorg.conf as normal user but it crashed the system.

                              This is, by definition, a bug in the kernel: no userspace program should cause a kernel to crash.

                              r0ller I also added a dri section which I don't know if it's necessary or not

                              Not needed, I think, but does no harm.

                              r0ller By the way, generating xorg.conf [...]

                              Don't need to generate a full xorg.conf these days.
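For reference, a file containing only a Device section is usually enough to select a driver, with everything else left to autodetection. The sketch below writes such a fragment; write_device_conf, the identifier Card0, and the output path are hypothetical, and whether modesetting or wsfb actually works on this hardware is exactly what the thread is trying to find out:

```sh
# write_device_conf DRIVER FILE: write a minimal xorg.conf that only
# selects the given Xorg driver.
write_device_conf() {
    cat > "$2" <<EOF
Section "Device"
    Identifier "Card0"
    Driver "$1"
EndSection
EOF
}

# Intended use as root (commented out so nothing is overwritten by accident):
#   write_device_conf modesetting /etc/X11/xorg.conf
#   write_device_conf wsfb /etc/X11/xorg.conf
```

Keeping the file this small makes it easy to switch between the two drivers while testing.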