Hardware? Do they shut down properly if you do it from the console or ssh?
I will try to restart it using the reboot command.
The computer consists of:
- i3-10100
- 16gb ddr4 ram
- MSI h410m pro
Interesting. BIOS update? Maybe check through all the settings, or do a factory reset on the BIOS? I have a similar board (H510 something) running proxmox and it works fine.
I would like to note that this may have been caused by a BIOS update, as it started sometime after one. I'll try another update now.
edit: already on the latest bios version.
reboot: machine restart
This makes me think it's a motherboard issue.
The system is done with its shutdown process and issued the reboot command, but the motherboard didn't restart.
There could be some electronics components which get wedged over time. My sound card will occasionally not boot unless it has been completely powered off for 30 seconds or so.
@potentiallynotfelix As a diagnostic, I would suggest SSHing in and using systemctl to shut them down. If that works, then you know the issue is with Cockpit. If it hangs even when systemd is asked to halt, then I would consider reverting to the previous BIOS to see if the problem persists.
Ok. Cockpit uses the shutdown command to shut down[src], but systemctl poweroff might work. I will also attempt to revert bioses if msi supports it. thank you very much!
@potentiallynotfelix OK, best of luck. I've had all sorts of weird issues with systemd as of late, but I'm not sure how many of those are inherent to systemd itself and how many are Ubuntu's config of same.
sudo systemctl reboot did the same. I'm starting to think this is BIOS related.
@potentiallynotfelix Well, flash to an older one and see how it goes. I've seen some weird BIOS issues. I've got an i7-6850K machine on an Asus motherboard, and after I flashed to the latest BIOS, the USB power strobed on and off every few seconds, so keyboard and mouse would work, then not work, then work, then not work. I thought something was broken with the hardware, but then found others had the same issue with the most current BIOS. I flashed to one release earlier and all was good.
Flashing an older BIOS seemed to succeed! I gave it 14 hours or so before attempting a reboot, and it seemed to reboot without stalling. I'll give it a few more days now and try another, but that seems to have fixed it.
Is it actual server hardware? I've seen some very weird things with real servers that take ages to reboot (I was assuming it was self-checking or something). Are you sure it's hung, and not just very slow to shut down/reboot?
Is there any serial/monitor output before the hang?
Monitor output after shutting down:
I've given it 6 hours or so to shut down, so it's almost 100% a hang not a slow shutdown
I had this issue: failed to finalise remaining DM devices. Which led me to here https://github.com/systemd/systemd/issues/15004 and Skinner927 mentions your issue in that thread
I'd try uninstalling nouveau completely and see if the issue persists for you
The xserver-xorg-video-nouveau package was not installed, how else would I remove nouveau?
that's only the X11 "driver" for it. nouveau is built into the kernel, the way to "uninstall" it is to make it not get loaded, by blacklisting it
https://wiki.archlinux.org/title/Nouveau
but this does not seem to be the problem
Agreed, lsmod | grep nouveau returns nothing, so I'm not concerned about nouveau or nvidia being the issue here.
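For anyone who wants to script that check: lsmod is a formatted view of /proc/modules, so the same test can be done by reading that file. A minimal sketch, assuming Python 3.9+. The name module_loaded is made up for illustration, and unlike grep it only matches the exact module name, not substrings:

```python
from pathlib import Path


def module_loaded(modules_text: str, name: str) -> bool:
    """Return True if `name` is the first field of any line in /proc/modules text."""
    return any(
        line.split()[0] == name
        for line in modules_text.splitlines()
        if line.strip()
    )


if __name__ == "__main__":
    try:
        text = Path("/proc/modules").read_text()
    except OSError:
        # Not on Linux, or /proc is not readable in this environment.
        text = ""
    print("nouveau loaded:", module_loaded(text, "nouveau"))
```

If this prints False after a fresh boot, nouveau was never loaded and blacklisting it would change nothing.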
I don't have an Nvidia GPU so I don't have any experience with it, but a quick search brought me to Nvidia's website, and the instructions seem to line up with users' answers on other forums.
Disable it here https://docs.nvidia.com/ai-enterprise/deployment/vmware/latest/nouveau.html or apparently installing Nvidia's proprietary drivers automatically blacklists Nouveau.
lsmod | grep nouveau returns nothing, so I assume removing my GPU automatically stopped it from being loaded. That sorta rules out nouveau as an issue.
Direct link to Skinner927's comment: https://github.com/systemd/systemd/issues/15004#issuecomment-2264687287
Seems it's an Nvidia issue. I also have that issue: the GPU locks up and the VM with the Nvidia passthrough freezes. I need a full reboot of the bare-metal machine to stop the GPU from being stuck drawing full power. Don't leave it on like that for hours or you will kill your hardware.
I have removed my gpu and the issue is still present.
Yes, sorry, I read your comment after writing mine. If you remove the GPU, is the log message the same but without the GPU lockup line?
Yeah that seems like a mainboard issue.
I ran Arch Linux for a year or so before converting it, with no issues with shutdown. What makes you think that's the cause?
Because you tried two different OSes and the point where it hangs is the point where the OS sends an APM/ACPI command to reboot / power off. This is the last thing the OS does. So if that's not happening something is wrong with the hardware, BIOS, or BIOS settings.
You could try the syslog (journalctl), but logging is probably already off at that point.
Yeah, journalctl logs show nothing relevant. I have tried both disabling ACPI and forcing it (acpi=force), but that didn't fix this. There are a lot of different combinations of ACPI settings I could try:
acpi=force noapic
nolapic
noapic
acpi_osi="Linux"
acpi_osi="Windows 2006"
acpi=ht
pci=noacpi
acpi=noirq
pnpacpi=off
But I found these in a post from someone for whom they didn't work, so I'm reluctant to try them.
Did you check in /proc/cmdline whether the params were taken into account? Perhaps you edited the config but didn't update the initramfs.
Yes, I've always made sure to use update-grub and checked cmdline to make sure it has the correct parameters. Regardless of acpi=force or acpi=off, it would still hang.
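That cmdline check can also be scripted, which helps when cycling through several parameter combinations. A rough sketch, assuming Python 3.9+. missing_params is a hypothetical helper, not part of any tool mentioned in this thread, and it compares whole whitespace-separated tokens, so quoted values containing spaces (like acpi_osi="Windows 2006") would need smarter parsing:

```python
from pathlib import Path


def missing_params(cmdline: str, wanted: list[str]) -> list[str]:
    """Return the entries of `wanted` that do not appear as tokens in `cmdline`."""
    present = set(cmdline.split())
    return [param for param in wanted if param not in present]


if __name__ == "__main__":
    try:
        running = Path("/proc/cmdline").read_text()
    except OSError:
        # Not on Linux, or /proc is not readable in this environment.
        running = ""
    # The parameters under test; swap in whichever combination was just configured.
    print("not applied:", missing_params(running, ["acpi=force", "noapic"]))
```

An empty list after a reboot means the kernel really did boot with every parameter that was configured.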
And I guess if you're in front of the computer, you could just press the reset button or unplug it at that point (after it successfully synchronized the disks). No need to let it sit; there is no harm done and no data to be lost at that point.
that is what I end up doing right now, but if I'm on vacation and I need to reboot, I'm fucked.
Do you know whether you use device mapper? What kind of device is /dev/dm-1?
"dmsetup info" might help
sudo dmsetup info returns:
Name: raven--vg-root
State: ACTIVE
Read Ahead: 256
Tables present: LIVE
Open count: 1
Event number: 0
Major, minor: 254, 0
Number of targets: 1
UUID: LVM-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Name: raven--vg-swap_1
State: ACTIVE
Read Ahead: 256
Tables present: LIVE
Open count: 2
Event number: 0
Major, minor: 254, 1
Number of targets: 1
UUID: LVM-xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
Did you make these yourself? If not, could you do an ls -l /dev/mapper? It shows which name corresponds to which dm device.
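The same mapping can be read from the dmsetup info output already posted, since the kernel node is named dm-N where N is the minor number. A small sketch, assuming Python 3.9+. dm_nodes is a made-up helper name, and the sample is trimmed to the two fields that matter:

```python
def dm_nodes(dmsetup_info: str) -> dict[str, str]:
    """Map kernel node names (dm-N) to device-mapper names from `dmsetup info` text."""
    nodes: dict[str, str] = {}
    name = None
    for line in dmsetup_info.splitlines():
        key, _, value = line.partition(":")
        key, value = key.strip(), value.strip()
        if key == "Name":
            name = value
        elif key == "Major, minor" and name is not None:
            # The value looks like "254, 1"; the minor number is the N in dm-N.
            minor = value.split(",")[1].strip()
            nodes[f"dm-{minor}"] = name
    return nodes


if __name__ == "__main__":
    sample = """\
Name: raven--vg-root
Major, minor: 254, 0
Name: raven--vg-swap_1
Major, minor: 254, 1
"""
    print(dm_nodes(sample))
    # {'dm-0': 'raven--vg-root', 'dm-1': 'raven--vg-swap_1'}
```

Going by that, /dev/dm-1 on this machine would be the LVM swap volume, and dm-0 the root volume.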
It says reboot. Are you issuing a reboot or a shutdown/poweroff? Entering sleep state 5 should be power off, right?
I click the reboot button on Cockpit, which issues a shutdown --reboot command as root. I agree that sleep state S5 is powered off. From the ACPI docs:
A computer state where the computer consumes a minimal amount of power. No user mode or system mode code is run. This state requires a large latency in order to return to the Working state. The system’s context will not be preserved by the hardware. The system must be restarted to return to the Working state. It is not safe to disassemble the machine in this state.
This likely means my system is failing to reach that S5/G2 state.
If you log in directly over SSH and issue the same command, not in the Cockpit interface, does it react the same?
No, sorry for not specifying. It's scraped together from old consumer components.
- i3-10100
- 16gb ddr4 ram
- MSI h410m pro
halt -p
thanks for the suggestion, could you elaborate on what this would do differently from the regular shutdown command that systemctl uses? thanks again
My understanding is that 'halt' had been an alias for 'halt -p', but that changed recently. -p tells the command to power off. Without it, it just shuts down process.
halt -p did nothing different. Still hung on shutdown.
Your machine isn't shutting down, it's trying to sleep.
You also have active KVM instances which are fighting to keep it alive.
Can you elaborate on why you suspect this? The Cockpit reboot or shutdown button uses the shutdown command directly along with a --reboot or --poweroff flag.
onSubmit(event) {
    const Dialogs = this.context;
    const arg = this.props.shutdown ? "--poweroff" : "--reboot";
    if (!this.props.shutdown)
        cockpit.hint("restart");
    cockpit.spawn(["shutdown", arg, this.state.when, this.state.message], { superuser: "require", err: "message" })
            .then(this.props.onClose || Dialogs.close)
            .catch(e => this.setState({ error: e.toString() }));
    event.preventDefault();
    return false;
}
(source)
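To reproduce from an SSH session what the button does, the argument list can be rebuilt by hand. A sketch in Python (3.9+) mirroring the handler quoted above. The function name is made up, and the defaults for when ("now") and message (empty) are my assumptions, since the snippet doesn't show what Cockpit stores in this.state:

```python
import shlex


def cockpit_shutdown_argv(poweroff: bool, when: str = "now", message: str = "") -> list[str]:
    """Build the argv that the quoted onSubmit handler hands to cockpit.spawn."""
    flag = "--poweroff" if poweroff else "--reboot"
    return ["shutdown", flag, when, message]


if __name__ == "__main__":
    # Printed only, never executed: paste the line into a root shell to test it.
    print(shlex.join(cockpit_shutdown_argv(poweroff=False)))
```

If the pasted command hangs the same way as the button, Cockpit itself is out of the picture, which matches what the systemctl reboot and halt -p attempts already suggest.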