Using linux iTCO_wdt driver on i3-7100


Hi,

I’m trying to get the TCO watchdog to work on an i3-7100 in an ASUS VivoMini VC66-B018Z desktop PC (https://www.asus.com/us/Mini-PCs/VivoMini-VC66/).

I have attached the output of dmidecode to this email.

I’m using a homebrew Linux distribution with Linux kernel 4.9.22. I have been using the TCO driver with this software stack on other hardware platforms without any issues.

On the Asus VC66, when I insert the TCO watchdog driver I get the following logs:

[15321.181802] iTCO_wdt: Intel TCO WatchDog Timer Driver v1.11
[15321.181939] iTCO_wdt: Found a Intel PCH TCO device (Version=4, TCOBASE=0x0400)
[15321.182061] iTCO_wdt: initialized. heartbeat=30 sec (nowayout=0)

But when I program a timeout and stop pinging the watchdog, the unit does not reboot. Nothing happens; the unit continues to operate normally.
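For reference, the test described above (open the watchdog, program a timeout, then stop pinging) can be sketched from userspace with the standard Linux watchdog ioctls. This is only a minimal sketch under stated assumptions: the function names are hypothetical, the ioctl numbers are rebuilt by hand from the generic _IOC layout used on x86, and the device path is assumed to be /dev/watchdog0. It must run as root, and it disarms the watchdog with the magic close character before the timeout expires, so it only shows whether the counter moves.

```python
import fcntl
import os
import struct
import time

# Generic Linux _IOC encoding (x86 layout): dir<<30 | size<<16 | type<<8 | nr.
_IOC_WRITE, _IOC_READ = 1, 2

def _ioc(direction, nr, size=4):
    """Build a watchdog ioctl number (type 'W', int-sized argument)."""
    return (direction << 30) | (size << 16) | (ord("W") << 8) | nr

WDIOC_SETTIMEOUT = _ioc(_IOC_READ | _IOC_WRITE, 6)
WDIOC_GETTIMEOUT = _ioc(_IOC_READ, 7)
WDIOC_GETTIMELEFT = _ioc(_IOC_READ, 10)

def _ioctl_int(fd, request, value=0):
    """Pass an int to the ioctl and return the int the kernel writes back."""
    raw = fcntl.ioctl(fd, request, struct.pack("i", value))
    return struct.unpack("i", raw)[0]

def arm_and_watch(timeout_s=30, watch_s=10, device="/dev/watchdog0"):
    """Arm the watchdog, never ping it, and print timeleft once per second.

    Keep watch_s below timeout_s: on a working watchdog the machine resets
    once the timeout expires without a ping.
    """
    fd = os.open(device, os.O_WRONLY)  # opening the device starts the timer
    try:
        actual = _ioctl_int(fd, WDIOC_SETTIMEOUT, timeout_s)
        print("timeout programmed:", actual, "s")
        for _ in range(watch_s):
            print("timeleft:", _ioctl_int(fd, WDIOC_GETTIMELEFT), "s")
            time.sleep(1)
    finally:
        os.write(fd, b"V")  # magic close: disarms when nowayout=0
        os.close(fd)
```

Calling arm_and_watch(30, 10) on a healthy system should print a decreasing timeleft; on the affected units the value would be expected to stay constant.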

I have investigated and found out that the TCO_RLD register does not decrement.

When I cat /sys/class/watchdog/watchdog0/timeleft I always get the same value (the value that I programmed as the timeout).
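The same check can be automated by sampling the sysfs attribute a few times. The following is a small sketch, assuming the watchdog is already armed and nothing is pinging it while the samples are taken; the helper names are hypothetical.

```python
import time
from pathlib import Path

TIMELEFT = "/sys/class/watchdog/watchdog0/timeleft"

def read_timeleft(path=TIMELEFT):
    """Return the remaining time in seconds as reported by sysfs."""
    return int(Path(path).read_text().strip())

def sample_timeleft(count=5, interval=1.0, path=TIMELEFT):
    """Read timeleft `count` times, sleeping `interval` seconds after each read."""
    samples = []
    for _ in range(count):
        samples.append(read_timeleft(path))
        time.sleep(interval)
    return samples

def is_counting_down(samples):
    """True if any sample is lower than the one taken just before it."""
    return any(later < earlier for earlier, later in zip(samples, samples[1:]))
```

With the behaviour described here, is_counting_down(sample_timeleft()) would return False, because every sample equals the programmed timeout.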

I have added a couple of prints in the driver to display the important TCO related registers in the set timeout function. Here are the values:

[ 7732.805330] iTCO_wdt: SMI_EN = 0x90002023
[ 7732.805335] iTCO_wdt: GC = 0x0
[ 7732.805338] iTCO_wdt: TCO_RLD = 0x32
[ 7732.805342] iTCO_wdt: TCO1_STS = 0x0
[ 7732.805346] iTCO_wdt: TCO2_STS = 0x0
[ 7732.805349] iTCO_wdt: TCO1_CNT = 0x1000
[ 7732.805353] iTCO_wdt: TCO2_CNT = 0x8
[ 7732.805357] iTCO_wdt: TCOv2_TMR = 0x32

When I check these values against the TCO spec they look good to me (i.e., the NR bit is cleared in the GC register, TCO_EN is set in the SMI_EN register, and TCO_TMR_HALT is cleared in TCO1_CNT).
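The bit checks above can be written down explicitly. This is a sketch only: the bit positions are assumptions taken from a reading of the PCH datasheet and the iTCO_wdt source (TCO_EN as bit 13 of SMI_EN, TCO_TMR_HALT as bit 11 and TCO_LOCK as bit 12 of TCO1_CNT, and the no-reboot bit as bit 1 for a version 4 device) and should be verified against the datasheet for the exact chipset.

```python
# Assumed bit positions; verify against the datasheet for your PCH.
TCO_EN = 1 << 13        # SMI_EN: TCO logic may generate SMIs
TCO_TMR_HALT = 1 << 11  # TCO1_CNT: timer halted when set
TCO_LOCK = 1 << 12      # TCO1_CNT: TCO configuration locked when set
NO_REBOOT = 1 << 1      # GC: reboot on second timeout suppressed when set

def decode_tco(smi_en, gc, tco1_cnt):
    """Return the state of the bits that gate the TCO watchdog."""
    return {
        "tco_en": bool(smi_en & TCO_EN),
        "no_reboot": bool(gc & NO_REBOOT),
        "timer_halted": bool(tco1_cnt & TCO_TMR_HALT),
        "tco_locked": bool(tco1_cnt & TCO_LOCK),
    }

# Values printed by the instrumented driver in this post.
state = decode_tco(smi_en=0x90002023, gc=0x0, tco1_cnt=0x1000)
print(state)
```

Under these assumptions the decode agrees with the reading above: TCO_EN set, no-reboot clear, timer not halted. The only other bit set in TCO1_CNT (0x1000) would be TCO_LOCK.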

Am I missing something?

Is there anything else that I could check to find out why the TCO timer is not counting down?

Can anyone help me with this issue?

Is this the appropriate forum for this question?

BR,
Pierre

Attachment: dmidecode.txt (text/plain, 20.9 KB)

Hey Pierre

The hardware we are discussing complies with the ACPI specifications, so I believe the following link should answer your questions:

http://msdn.microsoft.com/en-us/windows/hardware/gg463320.aspx

Joe


Joe,

I will look at the document you are referring to and I will get back to you.

Thanks!

Pierre


Joe,

I took a close look at the documentation you suggested.

This is not what I'm looking for.

I want to use the TCO watchdog that is part of the Intel 100 Series chipset PCH.

I am using the Intel datasheets 100-series-chipset-datasheet-vol-1.pdf and 100-series-chipset-datasheet-vol-2.pdf.

BR,

Pierre

 


I am trying to find a Linux watchdog expert. I will get back to you as soon as I can.


Hey Joe,

Any update on this?

BR,

Pierre


Hey Joseph,

We have the same problem that Pierre described.

In our case the hardware is an Intel NUC7i5BNK.

Have you found a solution for this problem?

BR,
Jonas


We are also having the same issue running Ubuntu 18.04 LTS on a NUC7i3BNK. Everything looks fine in dmesg etc., but the counter just never starts, so the server never reboots when the watchdog should tell it to. Not having much luck solving it... any help gratefully received!


Any movement on this problem? It is still there from what I can tell.

I'm using the 4.18.16, 4.20.16, and 5.1.11 kernels on a Fedora 29 distro. This watchdog hasn't worked since moving to Skylake devices (it worked on Sandy Bridge, Ivy Bridge, and earlier). It's broken on Cannon Lake and Coffee Lake devices as well. There are no indications of an error other than the fact that the timer countdown does not start (Timeleft) and the watchdog does not reset the device.

[    4.206354] i801_smbus 0000:00:1f.4: enabling device (0001 -> 0003)
[    4.206825] i801_smbus 0000:00:1f.4: SPD Write Disable is set
[    4.206858] i801_smbus 0000:00:1f.4: SMBus using PCI interrupt

[  168.218443] iTCO_vendor_support: vendor-support=0
[  168.221443] iTCO_wdt: Intel TCO WatchDog Timer Driver v1.11
[  168.221510] iTCO_wdt: Found a Intel PCH TCO device (Version=4, TCOBASE=0x0400)
[  168.221985] iTCO_wdt: initialized. heartbeat=30 sec (nowayout=0)

~# wdctl /dev/watchdog0
wdctl: write failed: Invalid argument
Device:        /dev/watchdog0
Identity:      iTCO_wdt [version 0]
Timeout:       30 seconds
Pre-timeout:    0 seconds
Timeleft:      30 seconds
FLAG           DESCRIPTION               STATUS BOOT-STATUS
KEEPALIVEPING  Keep alive ping reply          1           0
MAGICCLOSE     Supports magic close char      0           0
SETTIMEOUT     Set timeout (in seconds)       0           0

 

Any help understanding why this is so would be greatly appreciated. Are there any known workarounds? It's hard to believe this issue isn't already well known in the embedded Linux community.

Regards,
 

Mark


Hey,

Are there any updates on this topic?

We have the same problem (dmesg output looks fine but the watchdog is not counting down) on different Kaby Lake CPUs (i3, i5 and i7 NUC7 models) and different kernels (4.15, 4.19, 5.3, 5.4-rc5).

Regards,

Thomas


I couldn't find a subject matter expert on this when I last researched it.

You might also try the community forums and ask the question there.
