(In response to comment #2 by Günter Roeck)
> You need to set the temperature limits correctly. Without limits, the chips
> constantly generate alarms, which is the likely cause of these
> interrupts.
>
> However, this does not explain the completion timeouts. That could be
> another problem.
Hello!
Thanks for your answer. I have tried adjusting these limits so that the sensors no longer show an ALARM. That doesn't seem to be the cause, because after configuration the interrupts are still being generated en masse.
jc42-i2c-1-1b
Adapter: SMBus I801 adapter at e000
RAM: +30,0°C (low = +0,0°C)
jc42-i2c-1-19
Adapter: SMBus I801 adapter at e000
RAM: +32,0°C (low = +0,0°C)
jc42-i2c-1-1a
Adapter: SMBus I801 adapter at e000
RAM: +31,0°C (low = +0,0°C)
jc42-i2c-1-18
Adapter: SMBus I801 adapter at e000
RAM: +28,0°C (low = +0,0°C)
Cheers,
Conrad
(In response to comment #4 by Günter Roeck)
> Strange, especially since the chips shouldn't generate interrupts in the
> first place unless explicitly enabled (which the driver doesn't do, or
> at least it shouldn't). My wild guess is that taking the chips out
> of shutdown mode enables the interrupt for some reason.
>
> Can you send the output of "i2cdump -y -f 1 0x18 w"?
Here we go:
╭─root@Galactica ~
╰─➤ i2cdump -y -f 1 0x18 w
0,8 1,9 2,a 3,b 4,c 5,d 6,e 7,f
00: ef00 0000 0005 0000 0005 c801 1f00 0182
08: 0000 0000 0000 0000 0000 0000 0000 0000
10: 0000 0000 0000 0000 0000 0000 0000 0000
18: 0000 0000 0000 0000 0000 0000 0000 0000
20: 0000 0000 0000 0000 0000 0000 0000 0000
28: 0000 0000 0000 0000 0000 0000 0000 0000
30: 0000 0000 0000 0000 0000 0000 0000 0000
38: 0000 0000 0000 0000 0000 0000 0000 0000
40: 0000 0000 0000 0000 0000 0000 0000 0000
48: 0000 0000 0000 0000 0000 0000 0000 0000
50: 0000 0000 0000 0000 0000 0000 0000 0000
58: 0000 0000 0000 0000 0000 0000 0000 0000
60: 0000 0000 0000 0000 0000 0000 0000 0000
68: 0000 0000 0000 0000 0000 0000 0000 0000
70: 0000 0000 0000 0000 0000 0000 0000 0000
78: 0000 0000 0000 0000 0000 0000 0000 0000
80: 0000 0000 0000 0000 0000 0000 0000 0000
88: 0000 0000 0000 0000 0000 0000 0000 0000
90: 0000 0000 0000 0000 0000 0000 0000 0000
98: 0000 0000 0000 0000 0000 0000 0000 0000
a0: 0000 0000 0000 0000 0000 0000 0000 0000
a8: 0000 0000 0000 0000 0000 0000 0000 0000
b0: 0000 0000 0000 0000 0000 0000 0000 0000
b8: 0000 0000 0000 0000 0000 0000 0000 0000
c0: 0000 0000 0000 0000 0000 0000 0000 0000
c8: 0000 0000 0000 0000 0000 0000 0000 0000
d0: 0000 0000 0000 0000 0000 0000 0000 0000
d8: 0000 0000 0000 0000 0000 0000 0000 0000
e0: 0000 0000 0000 0000 0000 0000 0000 0000
e8: 0000 0000 0000 0000 0000 0000 0000 0000
f0: 0000 0000 0000 0000 0000 0000 0000 0000
f8: 0000 0000 0000 0000 0000 0000 0000 0000
> Also, do the interrupts stop when you unload the driver?
No, they don't stop until I do a hard reset of the server.
This is an Atmel AT30TS00. According to the configuration register, events are disabled and no events are pending, which means that it really shouldn't be the JC42s generating the interrupts.
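For readers who want to check that reading of the dump, here is a minimal sketch that decodes the first eight words shown above. It assumes the usual JC-42.4 register layout (register 1 is the configuration register, 5 the temperature, 6 and 7 the manufacturer and device IDs) and that i2cdump's word mode prints the two bytes of each register swapped. The bit positions in `decode_config` are an assumption taken from that layout, not something stated in this thread.

```python
# Words exactly as printed by "i2cdump -y -f 1 0x18 w" for registers 0x00-0x07.
RAW_WORDS = [0xEF00, 0x0000, 0x0005, 0x0000, 0x0005, 0xC801, 0x1F00, 0x0182]


def swap16(word):
    """Swap the two bytes: SMBus word reads are little-endian,
    while the sensor sends the most significant byte first."""
    return ((word & 0xFF) << 8) | (word >> 8)


def decode_temperature(reg):
    """13-bit two's complement value in units of 1/16 degC (bit 12 is the sign)."""
    value = reg & 0x0FFF
    if reg & 0x1000:
        value -= 0x1000
    return value / 16.0


def decode_config(reg):
    # Assumed bit positions: bit 3 = event output enable,
    # bit 4 = event status, bit 8 = shutdown.
    return {
        "event_output_enabled": bool(reg & (1 << 3)),
        "event_asserted": bool(reg & (1 << 4)),
        "shutdown": bool(reg & (1 << 8)),
    }


regs = [swap16(w) for w in RAW_WORDS]
config = decode_config(regs[1])
temperature = decode_temperature(regs[5])
manufacturer_id = regs[6]
device_id = regs[7]

print(f"config: {config}")
print(f"temperature: {temperature:.1f} degC")
print(f"manufacturer id: 0x{manufacturer_id:04x}, device id: 0x{device_id:04x}")
```

With the configuration word at 0x0000, both the event output enable and the event status bits come out clear, which matches the statement that events are disabled and none are pending.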
Another question: if you load only the i801 module after boot (i.e. you prevent the jc42 module from loading, for example by blacklisting it, but still load the i801 module), do you still get the interrupts?
Thanks,
Günter
(In response to comment #8 by Günter Roeck)
> Another question: if you load only the i801 module after boot (i.e. you prevent
> the jc42 module from loading, for example by blacklisting it, but still load the i801
> module), do you still get the interrupts?
That's my current situation ;-) jc42 is just a module, which is currently not loaded at boot, and i801 is compiled into my kernel. In that case, no interrupts are generated on i801_smbus.
Cheers,
Conrad
(In response to comment #10 by Günter Roeck)
> #7 indicates a problem with the i801 driver and its interrupt handling. #9
> contradicts that somewhat.
>
> Possibly the C2000 has problems with interrupts, or implements them differently
> from what the driver handles. This may be triggered by a real access on the
> bus. You can try to confirm this by running the i2cdump command after
> boot with no jc42 module loaded (i2cdetect -y 1 should not show any
> reserved addresses) and checking whether interrupts occur.
>
> Thanks,
> Günter
They do ;-) Immediately after running "i2cdump -y -f 1 0x18 w", the interrupts start en masse, even though jc42 was not loaded.
Cheers,
Conrad
I'm sorry, but I don't know what you mean by "reserved" here.
Before/after running i2cdump (output is the same):
╭─root@Galactica ~
╰─➤ i2cdetect -y 1
0 1 2 3 4 5 6 7 8 9 a b c d e f
00: -- -- -- -- -- 08 -- -- -- -- -- -- --
10: -- -- -- -- -- -- -- -- 18 19 1a 1b -- -- -- --
20: -- -- -- -- -- -- -- -- -- -- -- -- -- -- 2e --
30: 30 31 32 33 -- -- -- -- -- -- -- -- -- -- -- --
40: -- -- -- -- -- -- -- -- -- 49 -- -- -- -- -- --
50: 50 51 52 53 -- -- -- -- -- -- -- -- -- -- -- --
60: -- 61 -- -- -- -- -- -- -- 69 -- -- 6c -- -- --
70: -- -- -- -- -- -- -- --
A simple "i2cdetect -y 1" also triggers the interrupts.
(In response to comment #13 by Günter Roeck)
> By "reserved" I meant "a driver for a chip is loaded". After loading the
> jc42 driver (or the eeprom driver), you will see some of the addresses
> appear as "UU".
Oh I see. Yes, after loading jc42 I can see "UU".
╭─root@Galactica ~
╰─➤ i2cdetect -y 1
0 1 2 3 4 5 6 7 8 9 a b c d e f
00: -- -- -- -- -- 08 -- -- -- -- -- -- --
10: -- -- -- -- -- -- -- -- --
20: -- -- -- -- -- -- -- -- -- -- -- -- -- -- 2e --
30: 30 31 32 33 -- -- -- -- -- -- -- -- -- -- -- --
40: -- -- -- -- -- -- -- -- -- 49 -- -- -- -- -- --
50: 50 51 52 53 -- -- -- -- -- -- -- -- -- -- -- --
60: -- 61 -- -- -- -- -- -- -- 69 -- -- 6c -- -- --
70: -- -- -- -- -- -- -- --
> Anyway, I think the bottom line is that the i801 driver has problems with
> interrupt support on your hardware, as I suggested in #10. Bug #177291 is
> actually the same problem. Jean also maintains this driver, so he should be
> able to help.
Should I close #177291 as a duplicate since it's my ticket?
Thank you for your support. I hope Jean has an idea :)
Thank you Günter for intervening. I always suspected that the problem is in the SMBus driver (i2c-i801 driver) and wanted to comment for a long time but then forgot, sorry :-(
Conrad, I need detailed information about the SMBus PCI devices and IRQs on your machine. Please attach the output of:
$ /sbin/lspci -nn | grep SMBus
$ /sbin/lspci -xxx -s <device>
(for each device above)
$ cat /proc/interrupts
Also look near the top of the kernel logs for messages related to i2c, SMBus, i801 or PCI devices.
Hello Jean!
(In response to Jean Delvare's comment #16)
> $ /sbin/lspci -nn | grep SMBus
00:13.0 System peripheral [0880]: Intel Corporation Atom processor C2000 SMBus 2.0 [8086:1f15] (rev 02)
00:1f.3 SMBus [0c05]: Intel Corporation Atom processor C2000 PCU SMBus [8086:1f3c] (rev 02)
> $ /sbin/lspci -xxx -s <device>
> (for each device listed above)
╭─root@Galactica /home/kostecki
╰─➤ lspci -xxx -s 00:13.0
00:13.0 System peripheral: Intel Corporation Atom processor C2000 SMBus 2.0 (rev 02)
00: 86 80 15 1f 46 05 10 00 02 00 80 08 00 00 00 00
10: 04 40 f1 sig 0f 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 d9 15 20 08
30: 00 00 00 00 40 00 00 00 00 00 00 00 ab 01 00 00
40: 10 80 92 00 01 80 00 10 20 08 04 00 00 00 00 00
50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 10 00 00 00 00 00 00 00 00 00 00 00
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 01 8c 03 00 00 00 00 00 00 00 00 00 05 00 81 01
90: 0c f0 ef fe 00 00 00 00 a6 41 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 01 00 10 00 10 80
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
╭─root@Galactica /home/kostecki
╰─➤ lspci -xxx -s 00:1f.3
00:1f.3 SMBus: Intel Corporation Atom processor C2000 PCU SMBus (rev 02)
00: 86 80 3c 1f 43 01 98 02 02 00 05 0c 00 00 00 00
10: 00 00 50 df 00 00 00 00 00 00 00 00 00 00 00 00
20: 01 e0 00 00 00 00 00 00 00 00 00 00 d9 15 20 08
30: 00 00 00 00 00 00 00 00 00 00 00 00 ab 02 00 00
40: 11 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
60: 03 04 04 00 00 00 08 08 00 00 00 00 00 00 00 00
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 04 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 0f 02 01 03 03 03 00
> $ cat /proc/interrupts
See attached file.
> Also look for messages about i2c, SMBus, i801 or PCI devices
> at the top of the kernel logs.
╭─root@Galactica /
╰─➤ dmesg|grep -y smbus
[    7.968653] i801_smbus 0000:00:1f.3: enabling device (0140 -> 0143)
[    7.970338] i801_smbus 0000:00:1f.3: SMBus using PCI interrupt
[    7.974068] ismt_smbus 0000:00:13.0: enabling device (0140 -> 0142)
[  974.471917] ismt_smbus 0000:00:13.0: Completion timed out
[  975.512022] ismt_smbus 0000:00:13.0: Completion timed out
[  976.552097] ismt_smbus 0000:00:13.0: Completion timed out
[  977.592124] ismt_smbus 0000:00:13.0: Completion timed out
[  978.632168] ismt_smbus 0000:00:13.0: Completion timed out
[  979.682207] ismt_smbus 0000:00:13.0: Completion timed out
[  980.712251] ismt_smbus 0000:00:13.0: Completion timed out
[  981.752310] ismt_smbus 0000:00:13...
Good, thank you. I added people from Intel to Cc. I can't find any register descriptions for the Atom C2000's SMBus function, so there's not much I can do.
Conrad, SMBus support for this CPU family was added to the i2c-i801 driver a few years ago, so I'm wondering why this bug is only now being reported.
Is this new hardware for you? Or have you had it for a while and it was working fine until now and it broke with a kernel or OS update?
Jarkko, I found the same document, but it doesn't seem to contain any register definitions, or I'm blind.
(In response to Jean Delvare's comment #25)
> Jarkko, I found the same document, but it doesn't seem to contain any
> register definitions, or I'm blind.
Maybe chapters 15.8 and 18.5? I'm sorry if this is wrong; I'm not sure exactly what you are looking for.
The problem is that only the register addresses are provided, not the register definitions. Sure, there is a status register and we know its address, but we don't know how its bits are defined, or whether they are defined exactly as in other Intel CPUs.
Since the C2000 has a different microarchitecture than "mainstream" Intel CPUs, there is a real possibility that the register definitions are different.
Conrad, until we resolve the issue, you may be able to work around it by passing the disabled_ option
Jarkko, could you get a datasheet? It doesn't have to be public, as long as you can review the register definitions for us.
Hmm... It looks like this one somehow got abandoned. Jarkko, any news on this? Same question for Conrad: have you had any luck with kernels based on version 5.11 (or closer to the latest)?
- dmesg.tgz (150.4 KiB, application/x-tar)
Hello Kascardo.
Here you can find dmesg.tgz with:
dmesg/dmesg-
dmesg/dmesg-
dmesg/dmesg-
dmesg/dmesg-
dmesg-normal.txt is a full dmesg when the computer is running fine.
dmesg not responding
dmesg-2021-
When this happens, nothing is working properly. I even implemented a small script to restart the laptop in this case.
We are a school with hundreds of desktops and laptops running Ubuntu 20.04 with no issues. But we have received many of these Lenovo laptops that do not work well with Ubuntu 20.04 or 21.04.
I don't know if I can help you, but with Fedora 34 the laptop works fine.
Thank you for your attention.
This bug gave me the idea to try MSI on i801, but it seems that this device is not MSI capable on any platform. I'm not sure if this is useful information, but I think it's best to share it anyway.
This bug is missing log files that will help diagnose the problem. While running an Ubuntu kernel (not a mainline or third-party kernel), please enter the following command in a terminal window:
apport-collect 1931001
and then change the status of the bug to "Confirmed".
If you are unable to run this command due to the nature of the problem you are encountering, please add a comment stating that and change the bug status to "Confirmed".
This change was made by an automated script maintained by the Ubuntu Kernel Team.
1. Describe the problem:
Fedora 34 is completely unusable on an Acer Aspire 1 A114-32-P9MN (laptop), which probably doesn't have a high-quality BIOS implementation but works fine with Debian 11, Oracle Linux 8.4, etc. It only has problems with Fedora 34. The machine keeps reporting "soft lockup" and something about a watchdog that I don't know anything about. The soft lockups occur in many different modules and contexts (as evidenced by the large number of stack traces in the system logs). Booting from a Fedora 34 live USB often took over 30 minutes to show the initial desktop, while Oracle Linux boots in about 10 seconds and never causes a soft lockup. I tried dozens of configuration settings to fix the problem, but nothing improved. Considering the many user reports mentioning "soft lockup" in Fedora 34, it seems to me that something is seriously wrong with the build. For now I've switched to Oracle Linux and will no longer install Fedora on any machine.
2. What is the kernel version number?
5.13.4-
3. Did it work previously in Fedora? If so, in which kernel version did the issue
*first* appear? Older kernels are available for download at
https:/
I haven't encountered any issues with Fedora 32. After upgrading to Fedora 34 (5.13.4-
4. Can you reproduce this issue? If so, please provide the steps to reproduce
the issue below:
Install Fedora 34 on an Acer Aspire 1 A114-32-P9MN or simply boot from a live USB. It will hang with soft lockups.
5. Does this problem occur with the latest Rawhide kernel? To install the
Rawhide kernel, run ``sudo dnf install fedora-
``sudo dnf update --enablerepo=
Sorry, I switched to Oracle Linux and am in the process of migrating all my machines. Fedora is not an option when you have such severe problems with simple hardware.
6. Are you running any modules not shipped directly with the Fedora kernel?:
No, just a simple live USB triggers the problem in its maximum severity.
7. Please attach the kernel logs. You can get the complete kernel log
for a boot with ``journalctl --no-hostname -k > dmesg.txt``. If the
issue occurred on a previous boot, use the journalctl ``-b`` flag.
Sorry, system wiped for an Oracle Linux install, which works fine. But in any case, the Fedora 34 machine was so unresponsive that it would have been almost impossible to get the logs, even just copying them to a USB drive. Under Fedora 34, the machine is completely paralyzed.
- dmesg, lshw, lspci, syslog, screenshot atop (1.3 MiB, application/zip)
Same after upgrading from 5.8 to 5.11.
Soft lockups on boot, a long boot time and a very slow machine after that.
My machine is also Intel Celeron based, as per previous reports.
After logging into the desktop environment, the "atop" program shows that almost all CPU time is spent on irq, where it is usually close to 0 percent. (See attached file)
Login was difficult because not all keystrokes were processed.
The "old" Ubuntu still works as expected (see attached file).
See attached file for logs.
I tried the following:
* Ubuntu live image on USB: same problem
* Fedora Live image on USB: same problem
* Wait for the boot process to complete and collect the logs (kernel parameters: nomodeset debug verbose)
* Ran apt-get update; apt-get upgrade, rebooted, problem still there
* fsck, everything was fine.
* I tried the intel_idle kernel parameter.
* Tried the noapic kernel parameter (on a guess), same problem
I also tried:
* kernel parameter watchdog_thresh=20, same problem
* Fast Boot BIOS setting = disabled (was enabled), same problem
Possible workaround (not a fix): blacklist the i2c_i801 module. This works for me...
Noticing that a lot of CPU time was being spent handling interrupts, I looked at /proc/interrupts (right after the slow startup and login):
$ cat /proc/interrupts
            CPU0       CPU1
   0:          9          0   IR-IO-APIC    2-edge      timer
   1:          0        249   IR-IO-APIC    1-edge      i8042
   8:          1          0   IR-IO-APIC    8-edge      rtc0
   9:          0       1017   IR-IO-APIC    9-fasteoi   acpi
  14:          0        591   IR-IO-APIC   14-fasteoi   INT3453:00, INT3453:01, INT3453:03
  15:          0          0   IR-IO-APIC   15-fasteoi   INT3453:02
  20:  190734634          0   IR-IO-APIC   20-fasteoi   i801_smbus
  31:       8350          0   IR-IO-APIC   31-fasteoi   idma64.0, i2c_designware.0
  39:          0      84628   IR-IO-APIC   39-fasteoi   mmc0
 120:          0          0   DMAR-MSI      0-edge      dmar0
 121:          0          0   DMAR-MSI      1-edge      dmar1
 122:          0          0   IR-PCI-MSI 311296-edge    PCIe PME
 123:          0          0   IR-PCI-MSI 315392-edge    PCIe PME
 124:          0          0   IR-PCI-MSI 317440-edge    PCIe PME
 125:          0          0   IR-PCI-MSI 294912-edge    ahci[0000:00:12.0]
 126:          0          3   IR-PCI-MSI 1048576-edge   rtsx_pci
 127:       4171          0   IR-PCI-MSI 344064-edge    xhci_hcd
 128:          0        296   INT3453:00   18           ELAN0503:00
 129:          0          0   IR-PCI-MSI 1050624-edge   enp2s0f1
 130:          0         44   IR-PCI-MSI 245760-edge    mei_me
 131:      18279          0   IR-PCI-MSI 1572864-edge   ath10k_pci
 132:          0        669   IR-PCI-MSI 229376-edge    snd_hda_intel:card0
 NMI:        690         49   Non-maskable interrupts
 LOC:     693366     704015   Local timer interrupts
 SPU:          0          0   Spurious interrupts
 PMI:        690         49   Performance monitoring interrupts
 IWI:      31340      91937   IRQ work interrupts
 RTR:          0          0   APIC ICR read retries
 RES:      23071      21772   Rescheduling interrupts
 CAL:      10091       3666   Function call interrupts
 TLB:       2750       4570   TLB shootdowns
 TRM:          0          0   Thermal event interrupts
 THR:          0          0   Threshold APIC interrupts
 DFR:          0          0   Deferred Error APIC interrupts
 MCE:          0          0   Machine check exceptions
 MCP:         10         11   Machine check polls
 ERR:          0
 MIS:          0
 PIN:          0          0   Posted-interrupt notification event
 NPI:          0          0   Nested posted-interrupt event
 PIW:          0          0   Posted-interrupt wakeup event
This led me to the i801_smbus module which depends on the i2c_i801 module (I discovered it using lsmod).
After this ~similar~ problem (https:/
I have "modules_
Note: I don't fully understand the consequences of not having the i2c_i801 and i801_smbus modules.
The bet...
I'm not sure if it's fully related, but I would assume it is, at least partly.
I have two mini servers, one with a Supermicro A2SDi-8C-HLN4F (Atom C3758) and the other with an older Supermicro A1SRM-2758F (Atom C2758F).
I upgraded both from Debian Buster (Kernel 4.19.194-3) to Bullseye (5.10.46-5). There was no problem with the C3758, but I had a serious drop in performance with the C2758F.
When 5.10 is running on the C2758F, /proc/interrupts shows about 100,000 interrupts per second for "IO-APIC 18-fasteoi i801_smbus" and overall performance suffers badly (as compared to 4.19).
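A rate like this can be estimated from two snapshots of /proc/interrupts taken a known time apart. The sketch below is only an illustration: the helper names `irq_total` and `irq_rate` are hypothetical, and the two sample snapshots are made-up two-CPU excerpts chosen to give a round number, not data from this machine.

```python
def irq_total(proc_interrupts_text, name):
    """Sum the per-CPU counters of the first line whose last field is `name`."""
    lines = proc_interrupts_text.splitlines()
    cpu_count = len(lines[0].split())  # header row: CPU0 CPU1 ...
    for line in lines[1:]:
        fields = line.split()
        if fields and fields[-1] == name:
            return sum(int(f) for f in fields[1:1 + cpu_count])
    raise KeyError(name)


def irq_rate(sample_a, sample_b, name, seconds):
    """Interrupts per second between two snapshots taken `seconds` apart."""
    return (irq_total(sample_b, name) - irq_total(sample_a, name)) / seconds


# Illustrative snapshots (not real measurements), taken 10 seconds apart.
SAMPLE_T0 = """\
           CPU0       CPU1
 18:          0    7358862   IO-APIC   18-fasteoi   i801_smbus
 19:         12          0   IO-APIC   19-fasteoi   ehci_hcd:usb1
"""
SAMPLE_T1 = """\
           CPU0       CPU1
 18:          0    8358862   IO-APIC   18-fasteoi   i801_smbus
 19:         12          0   IO-APIC   19-fasteoi   ehci_hcd:usb1
"""

rate = irq_rate(SAMPLE_T0, SAMPLE_T1, "i801_smbus", seconds=10)
print(f"i801_smbus: {rate:.0f} interrupts/s")
```

On a live system the two snapshots would come from reading /proc/interrupts twice with a sleep in between. Matching on the last field is a simplification that only works for lines with a single handler name.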
So far I've fixed the problem by blacklisting i2c_i801. After finding this I tried adding the disabled_
I don't use jc42 at all, the distribution tools set the sensor thresholds to the correct values.
# i2cdetect -l
# sensors
nvme-pci-0400
Adapter: PCI adapter
Composite: +30.9°C (low = -273.1°C, high = +84.8°C)
Sensor 1: +30.9°C (low = -273.1°C, high = +65261.8°C)
Sensor 2: +31.9°C (low = -273.1°C, high = +65261.8°C)
coretemp-isa-0000
Adapter: ISA adapter
Core 0: +48.0°C (high = +98.0°C, crit = +98.0°C)
Core 1: +48.0°C (high = +98.0°C, crit = +98.0°C)
Core 2: +48.0°C (high = +98.0°C, crit = +98.0°C)
Core 3: +48.0°C (high = +98.0°C, crit = +98.0°C)
Core 4: +47.0°C (high = +98.0°C, crit = +98.0°C)
Core 5: +46.0°C (high = +98.0°C, crit = +98.0°C)
Core 6: +47.0°C (high = +98.0°C, crit = +98.0°C)
Core 7: +47.0°C (high = +98.0°C, crit = +98.0°C)
# dmesg | egrep -i '(smbus|i801)'
[    2.226240] ismt_smbus 0000:00:13.0: enabling device (0000 -> 0002)
[    2.229927] i801_smbus 0000:00:1f.3: enabling device (0000 -> 0003)
[    2.230089] i801_smbus 0000:00:1f.3: SPD Write Disable is set
[    2.230136] i801_smbus 0000:00:1f.3: SMBus using PCI interrupt
# lspci -nn | grep SMBus
00:13.0 System peripheral [0880]: Intel Corporation Atom processor C2000 SMBus 2.0 [8086:1f15] (rev 03)
00:1f.3 SMBus [0c05]: Intel Corporation Atom processor C2000 PCU SMBus [8086:1f3c] (rev 03)
# lspci -xxx -s 00:13.0
00:13.0 System peripheral: Intel Corporation Atom processor C2000 SMBus 2.0 (rev 03)
00: 86 80 15 1f 06 04 10 00 03 00 80 08 00 00 00 00
10: 04 70 31 df 00 00 00 00 00 00 00 00 00 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 d9 15 20 08
30: 00 00 00 00 40 00 00 00 00 00 00 00 ab 01 00 00
40: 10 80 92 00 01 80 00 10 20 08 04 00 00 00 00 00
50: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 10 00 00 00 00 00 00 00 00 00 00 00
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 01 8c 03 00 00 00 00 00 00 00 00 00 05 00 81 01
90: 04 00 e4 fe 00 00 00 00 21 40 00 00 00 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 01 00 10 00 10 80
f0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
# lspci -xxx -s 00:1f.3
00:1f.3 SMBus: Intel Corporation...
(In response to stephane.poignant's comment #42)
> I upgraded both from Debian Buster (Kernel 4.19.194-3) to Bullseye
> (5.10.46-5). There was no problem with the C3758, but I had a serious performance
> regression on the C2758F.
>
Interesting, did 4.19 work on the C2758F without the interrupt storm?
(In response to Jarkko Nikula's comment #44)
> (In response to comment #42 by stephane.poignant)
> > I upgraded both from Debian Buster (Kernel 4.19.194-3) to Bullseye
> > (5.10.46-5). There was no problem with the C3758, but I had a serious performance
> > regression on the C2758F.
> >
> Interesting, did 4.19 work on the C2758F without the interrupt storm?
I didn't check /proc/interrupts when I ran 4.19, so I can't be sure the interrupts weren't there. The drop in performance was definitely not there. I can verify this in a few weeks (remote server with no OOBM network).
Dmesg running 4.19 shows that interrupts were enabled:
[    0.000000] Linux version 4.19.0-17-amd64 (<hidden email address>) (gcc version 8.3.0 (Debian 8.3.0-6)) #1 SMP Debian 4.19.194-3 (2021-07-18)
[    0.000000] Command line: BOOT_IMAGE=
...
[    1.434097] Run /init as init process
[    1.782787] dca service started, version 1.12.1
[    1.783203] ismt_smbus 0000:00:13.0: enabling device (0000 -> 0002)
[    1.796694] cryptd: max_cpu_qlen set to 1000
[    1.801177] i801_smbus 0000:00:1f.3: enabling device (0000 -> 0003)
[    1.801317] i801_smbus 0000:00:1f.3: SPD Write Disable is set
[    1.801356] i801_smbus 0000:00:1f.3: SMBus using PCI interrupt
[    1.805199] igb: Intel(R) Gigabit Ethernet Network Driver - version 5.4.0-k
[    1.805202] igb: Copyright (c) 2007-2014 Intel Corporation.
[    1.805246] igb 0000:00:14.0: enabling device (0000 -> 0002)
[    1.816722] SSE version of gcm_enc/dec engaged.
...
The problem also exists in kernel 4.19 and other versions. It just depends on another driver triggering the interrupts. When that happens, the count goes very high. So it's possible that in 4.19 you didn't have any driver that triggered these interrupts, and as a result the bug was not hit.
@Jarkko Nikula: Since you are still replying, could you please try again to get the necessary documents, as requested by Jean Delvare?
@Conrad Kostecki: Yes, I agree with you, it's unlikely that the issue wasn't present in 4.19, as it was present much earlier.
I contacted our sales support and they told me that the Atom C2758 with the F suffix is a custom part for Supermicro. Unfortunately, they didn't find an explicit specification for the SMBus controller, but they said it's based on the same 22 nm Silvermont architecture as Bay Trail. I think SMBus IO should be supported.
Unfortunately, the public datasheets for Bay Trail also seem sparse, but I was able to find something by searching for datasheets for the Bay Trail E3825 used on the MinnowBoard Max. The following document appears to be available to registered users of ark.intel.com or via search engines:
"Intel Atom ® Processor E3800 Product Family" with document number: 538136 and Chapter 33 "PCU – System Management Bus (SMBus)"
Here is the output:
pcicst 0x298, SMBHSTSTS 0x60
[359.205884] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[359.205918] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[ 364.210031] i801_isr: 375367 callbacks suppressed
[364.210043] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[364.210085] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[364.210126] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[364.210142] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[364.210178] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[364.210217] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[364.210234] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[364.210253] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[364.210292] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[364.210329] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[ 369.220035] i801_isr: 380909 callbacks suppressed
[369.220047] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[369.220069] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[369.220109] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[369.220146] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[369.220185] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[369.220222] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[369.220262] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[369.220278] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[369.220317] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[369.220333] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[ 374.230078] i801_isr: 393736 callbacks suppressed
[374.230109] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[374.230151] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[374.230191] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[374.230210] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[374.230248] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[374.230283] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[374.230297] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[374.230332] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[374.230345] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[374.230358] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[ 379.240037] i801_isr: 382705 callbacks suppressed
[379.240068] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[379.240090] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[379.240110] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[379.240130] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[379.240150] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[379.240186] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[379.240205] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[379.240242] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[379.240281] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[379.240297] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[384.250032] i801_isr:387109 callback...
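The two values repeated in this debug output can be broken down into individual bits. The sketch below is hedged: the SMBHSTSTS bit names are the ones I recall from the i2c-i801 driver source, and it assumes that "pcicst" is the standard PCI status register, where bit 3 (0x08) is the Interrupt Status bit. The helper `decode_bits` is a hypothetical name used only here.

```python
# Host status bit names as assumed from the i2c-i801 driver source.
SMBHSTSTS_BITS = {
    0x01: "HOST_BUSY",
    0x02: "INTR",
    0x04: "DEV_ERR",
    0x08: "BUS_ERR",
    0x10: "FAILED",
    0x20: "SMBALERT_STS",
    0x40: "INUSE_STS",
    0x80: "BYTE_DONE",
}

# Interrupt Status bit of the standard PCI status register.
PCI_STATUS_INTERRUPT = 0x08


def decode_bits(value, names):
    """Return the names of all bits set in `value`, lowest bit first."""
    return [name for bit, name in sorted(names.items()) if value & bit]


host_status = decode_bits(0x60, SMBHSTSTS_BITS)
interrupt_pending = bool(0x298 & PCI_STATUS_INTERRUPT)

print("SMBHSTSTS 0x60:", host_status)
print("PCI interrupt status bit set:", interrupt_pending)
```

Under those assumptions, 0x60 means SMBALERT_STS and INUSE_STS are set while none of the transfer-related bits are, and the PCI status register reports a pending interrupt, which fits an interrupt raised by the alert signal rather than by a transaction.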
I just tested the patch and can confirm that it works. After applying the patch, interrupts on i801_smbus dropped to almost zero.
(In response to Conrad Kostecki's comment #55)
> I just tested the patch and can confirm that it works. After applying the patch,
> interrupts on i801_smbus dropped to almost zero.
According to the specification, the host (if ALERT is implemented) must issue a special read-byte command to see which device wants to send something. If a proper implementation doesn't fix this, it could be a pin issue (e.g. a pull-down sitting on the respective pin), or a PCB or firmware (BIOS) issue.
It would be nice to understand, if it's possible without much effort, what exactly asserts the ALERT.
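As a rough illustration of the alert handling described above, here is a small simulation of the SMBus alert response protocol: the host reads a byte from the Alert Response Address (0x0C), the alerting device with the lowest address wins arbitration, answers with its own address and releases the alert line. The classes and function names are hypothetical, and the `line_stuck_low` flag is my own addition to model the board-level fault case; none of this is driver code.

```python
ALERT_RESPONSE_ADDRESS = 0x0C  # SMBus Alert Response Address (for reference)


class AlertingDevice:
    """A device that may be pulling the shared alert line low."""

    def __init__(self, address, alerting=False):
        self.address = address
        self.alerting = alerting


def alert_line_asserted(devices):
    return any(d.alerting for d in devices)


def read_alert_response(devices):
    """Model a receive-byte from the ARA. The lowest alerting address wins
    arbitration and stops alerting. Returns None if no device answers."""
    candidates = [d for d in devices if d.alerting]
    if not candidates:
        return None
    winner = min(candidates, key=lambda d: d.address)
    winner.alerting = False
    return winner.address


def service_alerts(devices, line_stuck_low=False):
    """Poll the ARA until no device answers. Returns the addresses served
    and whether the alert line is still asserted afterwards."""
    served = []
    while line_stuck_low or alert_line_asserted(devices):
        address = read_alert_response(devices)
        if address is None:
            break
        served.append(address)
    still_asserted = line_stuck_low or alert_line_asserted(devices)
    return served, still_asserted


# Two of four sensors alert: the host can find and silence both.
sensors = [AlertingDevice(a) for a in (0x18, 0x19, 0x1A, 0x1B)]
sensors[2].alerting = True
sensors[1].alerting = True
served, still_asserted = service_alerts(sensors)

# Nobody alerts but the line is held low by something else.
idle = [AlertingDevice(a) for a in (0x18, 0x19, 0x1A, 0x1B)]
stuck_served, stuck_asserted = service_alerts(idle, line_stuck_low=True)

print([hex(a) for a in served], still_asserted)
print(stuck_served, stuck_asserted)
```

The second case is the interesting one for this bug: if no device answers the alert response read and the line stays asserted, protocol handling alone cannot clear the condition.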
I can confirm that I get the same results with both patches in my setup with Debian kernels.
The debug patch produces the same messages, and with the patch that disables SMB_ALERT, interrupts are no longer triggered.
Also when booting into the old kernel you were using (linux-
We will test the second version of the patch as soon as possible and provide you with the results.
## Kernel 4.19
# uname -a
Linux hrbpsrv01.intra.lan 4.19.0-17-amd64 #1 SMP Debian 4.19.194-3 (2021-07-18) x86_64 GNU/Linux
# cat /proc/interrupts | grep i801
18: 0 0 0 0 0 0 0 0 IO-APIC 18-fasteoi i801_smbus
# dmesg
...
[ 6652.023634] i801_smbus 0000:00:1f.3: SPD Write Disable is set
[ 6652.023689] i801_smbus 0000:00:1f.3: SMBus using PCI interrupt
...
## Debian linux-image-
# uname -a
Linux hrbpsrv01.intra.lan 5.10.0-9-amd64 #1 SMP Debian 5.10.70-1 (2021-09-30) x86_64 GNU/Linux
# cat /proc/interrupts | grep i801
18: 0 0 0 0 0 7358862 0 0 IO-APIC 18-fasteoi i801_smbus
(increasing at about 100,000 interrupts/sec)
# dmesg
...
[  516.429120] i801_smbus 0000:00:1f.3: SPD Write Disable is set
[  516.429140] i801_smbus 0000:00:1f.3: Interrupt pending!
[  516.429161] i801_smbus 0000:00:1f.3: SMBus using PCI interrupt
[  516.429933] i2c i2c-1: 4/4 memory slots populated (from DMI)
[  516.430337] at24 1-0050: supply vcc not found, using dummy regulator
[  516.431043] at24 1-0050: 256 byte spd EEPROM, read-only
[  516.431078] i2c i2c-1: Successfully instantiated SPD at 0x50
[  516.431455] at24 1-0051: supply vcc not found, using dummy regulator
[  516.432148] at24 1-0051: 256 byte spd EEPROM, read-only
[  516.432174] i2c i2c-1: Successfully instantiated SPD at 0x51
[  516.432576] at24 1-0052: supply vcc not found, using dummy regulator
[  516.433284] at24 1-0052: 256 byte spd EEPROM, read-only
[  516.433325] i2c i2c-1: Successfully instantiated SPD at 0x52
[  516.433748] at24 1-0053: supply vcc not found, using dummy regulator
[  516.434454] at24 1-0053: 256 byte spd EEPROM, read-only
[  516.434497] i2c i2c-1: Successfully instantiated SPD at 0x53
[525.513104] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[525.513133] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[525.513161] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[525.513185] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[525.513209] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[525.513234] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[525.513258] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[525.513281] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[525.513316] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[525.513352] i801_smbus 0000:00:1f.3: pcicst 0x298, SMBHSTSTS 0x60
[ 530.514207] i801_isr: 297603 callbacks suppressed
[ 530,5...
(In response to Andy Shevchenko's comment #59)
> (In response to Jarkko Nikula's comment #58)
> > 2nd version of the patch that disables the SMB_ALERT signal
>
> Side note: looking at this code, shouldn't we first clear pending interrupts,
> enable notifications, and only then the IRQ?
That's a good question, and it got me debugging more. In fact, disabling it doesn't disable detection, and SMBALERT_STS gets set, causing short bursts of interrupts during driver load and unload when the SMB_ALERT signal is asserted. It seems better to add some basic acknowledgment to i801_isr().
I'm not sure whether clearing pending interrupts at probe time would cause a regression, but acknowledging SMBALERT_STS in i801_isr() ensures that the state doesn't persist forever if it occurs after probe.
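To show why an unacknowledged status bit turns into a storm, here is a toy model. It is not the actual patch: it simply assumes a level-triggered interrupt line that stays asserted as long as SMBALERT_STS is set, and a write-one-to-clear status register. The class and function names are hypothetical.

```python
SMBALERT_STS = 0x20  # bit value assumed from the i2c-i801 driver source


class HostStatusRegister:
    """Toy status register whose bits are cleared by writing 1 to them."""

    def __init__(self):
        self.value = 0

    def hardware_sets(self, bits):
        self.value |= bits

    def write(self, bits):
        # Write-one-to-clear: every bit written as 1 is cleared.
        self.value &= ~bits & 0xFF

    def irq_asserted(self):
        # Level-triggered: the line stays active while the bit is set.
        return bool(self.value & SMBALERT_STS)


def run_isr_loop(ack_alert, max_calls=1000):
    """Count how often the handler runs for a single alert event.
    `max_calls` only stops the simulation; a real system would keep going."""
    reg = HostStatusRegister()
    reg.hardware_sets(SMBALERT_STS)
    calls = 0
    while reg.irq_asserted() and calls < max_calls:
        calls += 1
        if ack_alert:
            reg.write(SMBALERT_STS)  # acknowledge inside the handler
    return calls


calls_without_ack = run_isr_loop(ack_alert=False)
calls_with_ack = run_isr_loop(ack_alert=True)
print(calls_without_ack, calls_with_ack)
```

Without the acknowledgment the handler is re-entered until the simulation cap is hit, while with it the handler runs once per alert event. The model ignores the case where the signal itself stays asserted, which is why the thread also discusses disabling the alert interrupt.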
(In response to Jarkko Nikula's comment #63)
> (In response to Andy Shevchenko's comment #59)
> > (In response to Jarkko Nikula's comment #58)
> > > 2nd version of the patch that disables the SMB_ALERT signal
> >
> > Side note: looking at this code, shouldn't we first clear pending interrupts,
> > enable notifications, and only then the IRQ?
>
> That's a good question, and it got me debugging more. In fact, disabling it doesn't
> disable detection, and SMBALERT_STS gets set, causing short bursts of
> interrupts during driver load and unload when the SMB_ALERT signal is
> asserted. It seems better to add some basic acknowledgment to
> i801_isr().
>
> I'm not sure whether clearing pending interrupts at probe time would cause a
> regression, but acknowledging SMBALERT_STS in i801_isr() ensures that the
> state doesn't persist forever if it occurs after probe.
It also makes sense to try it with DEBUG_SHIRQ enabled (yes, I know more than half of the drivers in the Linux kernel fail or misbehave with it, and not many developers know about this debugging feature).
New, better workaround: instead of blacklisting i2c-i801, keep it loaded but disable interrupts and use polling instead.
Step 1 (temporary, needed to boot for Step 2):
a. Go to the GRUB menu at boot (press ESC once while booting).
b. Select the (Ubuntu) Linux entry you want to boot and press "e" to edit it.
c. Edit the line that starts with " linux /boot/vmlinuz ...".
d. At the end of this line add "i2c-i801.
e. Press F10.
Now the machine boots with this new parameter for the i2c-i801 module. This applies only once; the next boot will be without this parameter (unless you add it again manually by repeating the steps above).
Step 2 (after booting and logging in, make it permanent):
a. Run sudo vi /etc/modprobe.
b. Add the line "options i2c-i801 disable_
c. To ensure it is used at boot, run "sudo update-initramfs -u".
With this workaround the i2c-i801 module is still loaded, but it uses polling instead of interrupts. I think that's better than having no i2c-i801 at all.
I can boot, the problem does not occur.
Another solution...
I can't create logs with apport-collect because in this case booting is too slow.
Therefore, changing the status to Confirmed.
This bug is believed to have been fixed in kernel v5.16 by the following 2 commits:
Commit 03a976c9afb5e3c
Author: Jarkko Nikula
Date: Wed Nov 17 11:45:09 2021 +0200
i2c: i801: Fix interrupt storm from SMB_ALERT signal
Commit 9b5bf5878138293
Author: Jean Delvare
Date: Tue Nov 9 16:02:57 2021 +0100
i2c: i801: Restore INTREN on unload
This message is a reminder that Fedora Linux 34 is nearing its end of life.
Fedora will stop maintaining and issuing updates for Fedora Linux 34 on 2022-06-07.
It is Fedora's policy to close all bug reports from releases that are no longer
maintained. At that time this bug will be closed as EOL if it remains open with a
'version' of '34'.
Package Maintainer: If you wish for this bug to remain open because you
plan to fix it in a currently maintained version, change the 'version'
to a later Fedora Linux version.
Thank you for reporting this issue and we are sorry that we were not
able to fix it before Fedora Linux 34 is end of life. If you would still like
to see this bug fixed and are able to reproduce it against a later version
of Fedora Linux, you are encouraged to change the 'version' to a later version
prior to this bug being closed.