Inodes that were part of a corrupted orphan linked list found. How to understand what caused it and how to...



























I bought a Centurion Nano from the now-defunct Alpha Computers. It ships with Alpha OS, which is essentially a modified Ubuntu:



$ cat /etc/os-release
NAME="Alpha OS"
VERSION="1.0.0 Polaris"
ID="alpha-os"
ID_LIKE=ubuntu
PRETTY_NAME="Alpha OS 1.0.0 Polaris"
VERSION_ID="1.0.0"
HOME_URL="https://alpha.store/"
SUPPORT_URL="https://alpha.store/forums/forum/alpha-product-discussion/"
BUG_REPORT_URL="https://alpha.store/forums/forum/alpha-product-discussion/"
VERSION_CODENAME=polaris
UBUNTU_CODENAME=polaris
$ uname -a
Linux centurion 4.15.0-29-generic #31~16.04.1-Ubuntu SMP Wed Jul 18 08:54:04 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux


Today, after booting up, I noticed that my / mount was read-only. I rebooted and got this message:



Inodes that were part of a corrupted orphan linked list found.
UNEXPECTED INCONSISTENCY; RUN fsck MANUALLY.


on /dev/sdb2. Since this is the second time it has happened in one month, I'd like to understand what might be causing it and how to make sure it doesn't happen again.
The first time, I think the system hung at shutdown and I powered it off. This time the shutdown completed successfully (or so I thought).



Here are more details about the drive:



dat@centurion:~$ sudo hdparm -I /dev/sdb

/dev/sdb:

ATA device, with non-removable media
Model Number: Lenovo SSD SL700 M.2 128G
Serial Number: B0E1077A19DD00000503
Firmware Revision: SBFM51.2
Transport: Serial, ATA8-AST, SATA 1.0a, SATA II Extensions, SATA Rev 2.5, SATA Rev 2.6, SATA Rev 3.0
Standards:
Supported: 11 10 9 8 7 6 5
Likely used: 11
Configuration:
Logical max current
cylinders 16383 16383
heads 16 16
sectors/track 63 63
--
CHS current addressable sectors: 16514064
LBA user addressable sectors: 250069680
LBA48 user addressable sectors: 250069680
Logical Sector size: 512 bytes
Physical Sector size: 512 bytes
Logical Sector-0 offset: 0 bytes
device size with M = 1024*1024: 122104 MBytes
device size with M = 1000*1000: 128035 MBytes (128 GB)
cache/buffer size = unknown
Form Factor: less than 1.8 inch
Nominal Media Rotation Rate: Solid State Device
Capabilities:
LBA, IORDY(can be disabled)
Queue depth: 32
Standby timer values: spec'd by Standard, no device specific minimum
R/W multiple sector transfer: Max = 16 Current = 16
DMA: mdma0 mdma1 mdma2 udma0 udma1 udma2 udma3 udma4 udma5 *udma6
Cycle time: min=120ns recommended=120ns
PIO: pio0 pio1 pio2 pio3 pio4
Cycle time: no flow control=120ns IORDY flow control=120ns
Commands/features:
Enabled Supported:
* SMART feature set
Security Mode feature set
* Power Management feature set
* Write cache
* Look-ahead
* Host Protected Area feature set
* WRITE_BUFFER command
* READ_BUFFER command
* NOP cmd
* DOWNLOAD_MICROCODE
SET_MAX security extension
* 48-bit Address feature set
* Device Configuration Overlay feature set
* Mandatory FLUSH_CACHE
* FLUSH_CACHE_EXT
* SMART error logging
* SMART self-test
* General Purpose Logging feature set
* WRITE_{DMA|MULTIPLE}_FUA_EXT
* 64-bit World wide name
* WRITE_UNCORRECTABLE_EXT command
* {READ,WRITE}_DMA_EXT_GPL commands
* Segmented DOWNLOAD_MICROCODE
* Gen1 signaling speed (1.5Gb/s)
* Gen2 signaling speed (3.0Gb/s)
* Gen3 signaling speed (6.0Gb/s)
* Native Command Queueing (NCQ)
* Phy event counters
* READ_LOG_DMA_EXT equivalent to READ_LOG_EXT
* DMA Setup Auto-Activate optimization
Device-initiated interface power management
* Software settings preservation
* DOWNLOAD MICROCODE DMA command
* SET MAX SETPASSWORD/UNLOCK DMA commands
* WRITE BUFFER DMA command
* READ BUFFER DMA command
* DEVICE CONFIGURATION SET/IDENTIFY DMA commands
* Data Set Management TRIM supported (limit 8 blocks)
Security:
Master password revision code = 65534
supported
not enabled
not locked
frozen
not expired: security count
supported: enhanced erase
20min for SECURITY ERASE UNIT. 60min for ENHANCED SECURITY ERASE UNIT.
Logical Unit WWN Device Identifier: 0000000000000000
NAA : 0
IEEE OUI : 000000
Unique ID : 000000000
Checksum: correct


The partition is mounted as ext4:



dat@centurion:~$ blkid /dev/sdb2 
/dev/sdb2: UUID="3fd4075e-6d86-4535-9db6-f78b29f942e8" TYPE="ext4" PARTUUID="b4da84e6-2d39-4a40-b732-581a79ae72af"
dat@centurion:~$ cat /etc/mtab | grep sdb2
/dev/sdb2 / ext4 rw,relatime,errors=remount-ro,data=ordered 0 0


with an encrypted home directory:



dat@centurion:~$ cat /etc/mtab | grep home
/home/dat/.Private /home/dat ecryptfs rw,nosuid,nodev,relatime,ecryptfs_fnek_sig=sumtin,ecryptfs_sig=sumtinelse,ecryptfs_cipher=aes,ecryptfs_key_bytes=16,ecryptfs_unlink_sigs 0 0


And here are the details of the recovery process:



full recovery process



SMART (and non SMART) values:



dat@centurion:~$ sudo smartctl -x /dev/sdb
smartctl 6.5 2016-01-24 r4214 [x86_64-linux-4.15.0-29-generic] (local build)
Copyright (C) 2002-16, Bruce Allen, Christian Franke, www.smartmontools.org

=== START OF INFORMATION SECTION ===
Device Model: Lenovo SSD SL700 M.2 128G
Serial Number: B0E1077A19DD00000503
LU WWN Device Id: 0 000000 000000000
Firmware Version: SBFM51.2
User Capacity: 128,035,676,160 bytes [128 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
Form Factor: < 1.8 inches
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: Unknown(0x0ff8) (minor revision not indicated)
SATA Version is: SATA 3.2, 6.0 Gb/s (current: 6.0 Gb/s)
Local Time is: Wed Oct 10 11:58:55 2018 PDT
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is: Unavailable
APM feature is: Unavailable
Rd look-ahead is: Enabled
Write cache is: Enabled
ATA Security is: Disabled, frozen [SEC2]
Wt Cache Reorder: Unavailable

=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED

General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: (65535) seconds.
Offline data collection
capabilities: (0x79) SMART execute Offline immediate.
No Auto Offline data collection support.
Suspend Offline collection upon new
command.
Offline surface scan supported.
Self-test supported.
Conveyance Self-test supported.
Selective Self-test supported.
SMART capabilities: (0x0003) Saves SMART data before entering
power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 30) minutes.
Conveyance self-test routine
recommended polling time: ( 6) minutes.

SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
1 Raw_Read_Error_Rate PO-R-- 100 100 050 - 0
9 Power_On_Hours -O--C- 100 100 000 - 2404
12 Power_Cycle_Count -O--C- 100 100 000 - 283
168 Unknown_Attribute -O--C- 100 100 000 - 0
170 Unknown_Attribute PO---- 094 094 010 - 76
173 Unknown_Attribute -O--C- 100 100 000 - 1769532
192 Power-Off_Retract_Count -O--C- 100 100 000 - 36
194 Temperature_Celsius PO---K 067 067 000 - 33 (Min/Max 33/33)
218 Unknown_Attribute PO-R-- 100 100 050 - 0
231 Temperature_Celsius PO--C- 100 100 000 - 97
241 Total_LBAs_Written -O--C- 100 100 000 - 1901
||||||_ K auto-keep
|||||__ C event count
||||___ R error rate
|||____ S speed/performance
||_____ O updated online
|______ P prefailure warning

General Purpose Log Directory Version 1
SMART Log Directory Version 1 [multi-sector log support]
Address Access R/W Size Description
0x00 GPL,SL R/O 1 Log Directory
0x01 SL R/O 1 Summary SMART error log
0x02 SL R/O 51 Comprehensive SMART error log
0x03 GPL R/O 64 Ext. Comprehensive SMART error log
0x04 GPL,SL R/O 8 Device Statistics log
0x06 SL R/O 1 SMART self-test log
0x07 GPL R/O 1 Extended self-test log
0x09 SL R/W 1 Selective self-test log
0x10 GPL R/O 1 SATA NCQ Queued Error log
0x11 GPL R/O 1 SATA Phy Event Counters log
0x30 GPL,SL R/O 9 IDENTIFY DEVICE data log
0x80-0x9f GPL,SL R/W 16 Host vendor specific log

SMART Extended Comprehensive Error Log Version: 1 (64 sectors)
No Errors Logged

SMART Extended Self-test Log Version: 1 (1 sectors)
No self-tests have been logged. [To run self-tests, use: smartctl -t]

SMART Selective self-test log data structure revision number 0
Note: revision number not 1 implies that no selective self-test has ever been run
SPAN MIN_LBA MAX_LBA CURRENT_TEST_STATUS
1 0 0 Not_testing
2 0 0 Not_testing
3 0 0 Not_testing
4 0 0 Not_testing
5 0 0 Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

SCT Commands not supported

Device Statistics (GP Log 0x04)
Page Offset Size Value Flags Description
0x01 ===== = = === == General Statistics (rev 1) ==
0x01 0x008 4 283 --- Lifetime Power-On Resets
0x01 0x010 4 2404 --- Power-on Hours
0x01 0x018 6 3987986978 --- Logical Sectors Written
0x01 0x028 6 1577724785 --- Logical Sectors Read
0x04 ===== = = === == General Errors Statistics (rev 1) ==
0x04 0x008 4 0 --- Number of Reported Uncorrectable Errors
0x05 ===== = = === == Temperature Statistics (rev 1) ==
0x05 0x008 1 33 --- Current Temperature
0x05 0x020 1 33 --- Highest Temperature
0x05 0x028 1 33 --- Lowest Temperature
0x06 ===== = = === == Transport Statistics (rev 1) ==
0x06 0x018 4 0 --- Number of Interface CRC Errors
0x07 ===== = = === == Solid State Device Statistics (rev 1) ==
0x07 0x008 1 2 --- Percentage Used Endurance Indicator
|||_ C monitored condition met
||__ D supports DSN
|___ N normalized value

SATA Phy Event Counters (GP Log 0x11)
ID Size Value Description
0x0001 2 0 Command failed due to ICRC error
0x0003 2 0 R_ERR response for device-to-host data FIS
0x0004 2 0 R_ERR response for host-to-device data FIS
0x0006 2 0 R_ERR response for device-to-host non-data FIS
0x0007 2 0 R_ERR response for host-to-device non-data FIS
0x0008 2 0 Device-to-host non-data FIS retries
0x0009 4 2 Transition from drive PhyRdy to drive PhyNRdy
0x000a 4 2 Device-to-host register FISes sent due to a COMRESET
0x000f 2 0 R_ERR response for host-to-device data FIS, CRC
0x0010 2 0 R_ERR response for host-to-device data FIS, non-CRC
0x0012 2 0 R_ERR response for host-to-device non-data FIS, CRC
0x0013 2 0 R_ERR response for host-to-device non-data FIS, non-CRC


In syslog I can see an entry saying sdb2 was re-mounted, but I'm not sure how to interpret it, and I can't find anything else that looks relevant:



Oct  9 10:21:38 centurion kernel: [    2.621017] ata1: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Oct 9 10:21:38 centurion kernel: [ 2.621040] ata3: SATA link up 6.0 Gbps (SStatus 133 SControl 300)
Oct 9 10:21:38 centurion kernel: [ 2.621064] ata2: SATA link down (SStatus 4 SControl 300)
Oct 9 10:21:38 centurion kernel: [ 2.621258] ata3.00: ATA-11: Lenovo SSD SL700 M.2 128G, SBFM51.2, max UDMA/133
Oct 9 10:21:38 centurion kernel: [ 2.621259] ata3.00: 250069680 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
Oct 9 10:21:38 centurion kernel: [ 2.621479] ata3.00: configured for UDMA/133
Oct 9 10:21:38 centurion kernel: [ 2.621588] ata1.00: ATA-10: HGST HTS541010B7E610, 01.01A01, max UDMA/133
Oct 9 10:21:38 centurion kernel: [ 2.621589] ata1.00: 1953525168 sectors, multi 16: LBA48 NCQ (depth 31/32), AA
Oct 9 10:21:38 centurion kernel: [ 2.622197] ata1.00: configured for UDMA/133
Oct 9 10:21:38 centurion kernel: [ 2.622455] scsi 0:0:0:0: Direct-Access ATA HGST HTS541010B7 1A01 PQ: 0 ANSI: 5
Oct 9 10:21:38 centurion kernel: [ 2.622683] sd 0:0:0:0: [sda] 1953525168 512-byte logical blocks: (1.00 TB/932 GiB)
Oct 9 10:21:38 centurion kernel: [ 2.622684] sd 0:0:0:0: [sda] 4096-byte physical blocks
Oct 9 10:21:38 centurion kernel: [ 2.622692] sd 0:0:0:0: [sda] Write Protect is off
Oct 9 10:21:38 centurion kernel: [ 2.622693] sd 0:0:0:0: [sda] Mode Sense: 00 3a 00 00
Oct 9 10:21:38 centurion kernel: [ 2.622699] sd 0:0:0:0: [sda] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 9 10:21:38 centurion kernel: [ 2.622725] sd 0:0:0:0: Attached scsi generic sg0 type 0
Oct 9 10:21:38 centurion kernel: [ 2.622957] scsi 2:0:0:0: Direct-Access ATA Lenovo SSD SL700 51.2 PQ: 0 ANSI: 5
Oct 9 10:21:38 centurion kernel: [ 2.623168] sd 2:0:0:0: Attached scsi generic sg1 type 0
Oct 9 10:21:38 centurion kernel: [ 2.623280] sd 2:0:0:0: [sdb] 250069680 512-byte logical blocks: (128 GB/119 GiB)
Oct 9 10:21:38 centurion kernel: [ 2.623337] sd 2:0:0:0: [sdb] Write Protect is off
Oct 9 10:21:38 centurion kernel: [ 2.623338] sd 2:0:0:0: [sdb] Mode Sense: 00 3a 00 00
Oct 9 10:21:38 centurion kernel: [ 2.623379] sd 2:0:0:0: [sdb] Write cache: enabled, read cache: enabled, doesn't support DPO or FUA
Oct 9 10:21:38 centurion kernel: [ 2.641154] sda: sda1
Oct 9 10:21:38 centurion kernel: [ 2.641429] sd 0:0:0:0: [sda] Attached SCSI disk
Oct 9 10:21:38 centurion kernel: [ 2.655999] sdb: sdb1 sdb2 sdb3
Oct 9 10:21:38 centurion kernel: [ 2.657197] sd 2:0:0:0: [sdb] Attached SCSI disk
Oct 9 10:21:38 centurion kernel: [ 2.976451] clocksource: Switched to clocksource tsc
Oct 9 10:21:38 centurion kernel: [ 3.487633] Console: switching to colour frame buffer device 240x67
Oct 9 10:21:38 centurion kernel: [ 3.507287] i915 0000:00:02.0: fb0: inteldrmfb frame buffer device
Oct 9 10:21:38 centurion kernel: [ 3.547895] random: fast init done
Oct 9 10:21:38 centurion kernel: [ 3.634734] psmouse serio1: elantech: assuming hardware version 4 (with firmware version 0x361f00)
Oct 9 10:21:38 centurion kernel: [ 3.674405] psmouse serio1: elantech: Synaptics capabilities query result 0x00, 0x16, 0x0d.
Oct 9 10:21:38 centurion kernel: [ 3.740007] raid6: sse2x1 gen() 10059 MB/s
Oct 9 10:21:38 centurion kernel: [ 3.788005] raid6: sse2x1 xor() 6131 MB/s
Oct 9 10:21:38 centurion kernel: [ 3.808299] [drm] RC6 on
Oct 9 10:21:38 centurion kernel: [ 3.836004] raid6: sse2x2 gen() 12046 MB/s
Oct 9 10:21:38 centurion kernel: [ 3.884002] raid6: sse2x2 xor() 8275 MB/s
Oct 9 10:21:38 centurion kernel: [ 3.932005] raid6: sse2x4 gen() 13873 MB/s
Oct 9 10:21:38 centurion kernel: [ 3.980004] raid6: sse2x4 xor() 9533 MB/s
Oct 9 10:21:38 centurion kernel: [ 4.028005] raid6: avx2x1 gen() 23736 MB/s
Oct 9 10:21:38 centurion kernel: [ 4.076004] raid6: avx2x1 xor() 17173 MB/s
Oct 9 10:21:38 centurion kernel: [ 4.124002] raid6: avx2x2 gen() 27103 MB/s
Oct 9 10:21:38 centurion kernel: [ 4.172003] raid6: avx2x2 xor() 18831 MB/s
Oct 9 10:21:38 centurion kernel: [ 4.220003] raid6: avx2x4 gen() 30098 MB/s
Oct 9 10:21:38 centurion kernel: [ 4.268004] raid6: avx2x4 xor() 22359 MB/s
Oct 9 10:21:38 centurion kernel: [ 4.268701] raid6: using algorithm avx2x4 gen() 30098 MB/s
Oct 9 10:21:38 centurion kernel: [ 4.269390] raid6: .... xor() 22359 MB/s, rmw enabled
Oct 9 10:21:38 centurion kernel: [ 4.270077] raid6: using avx2x2 recovery algorithm
Oct 9 10:21:38 centurion kernel: [ 4.270769] psmouse serio1: elantech: Elan sample query result 00, 49, 75
Oct 9 10:21:38 centurion kernel: [ 4.273643] xor: automatically using best checksumming function avx
Oct 9 10:21:38 centurion kernel: [ 4.284699] Btrfs loaded, crc32c=crc32c-intel
Oct 9 10:21:38 centurion kernel: [ 4.506433] input: ETPS/2 Elantech Touchpad as /devices/platform/i8042/serio1/input/input6
Oct 9 10:21:38 centurion kernel: [ 9.433983] EXT4-fs (sdb2): mounted filesystem with ordered data mode. Opts: (null)
Oct 9 10:21:38 centurion kernel: [ 10.700673] Lockdown: /dev/mem,kmem,port is restricted; see man kernel_lockdown.7
Oct 9 10:21:38 centurion kernel: [ 12.663600] lp: driver loaded but no devices found
Oct 9 10:21:38 centurion kernel: [ 12.790174] ppdev: user-space parallel port driver
Oct 9 10:21:38 centurion kernel: [ 15.800260] EXT4-fs (sdb2): re-mounted. Opts: errors=remount-ro




UPDATE



This is still happening. I've tried the following measures, more or less at random, in the hope of fixing it, but to no avail:




  • removed the encrypted user home (now the entire disk is a plain ext4 with no ecryptfs)

  • removed encryption from swap and moved to a swapfile on a different disk

  • updated the kernel to 4.15.0.42.63 amd64 [from: 4.15.0.29.51]


I feel like the problem happens when the system is overloaded, but it could easily be the other way around (errors -> read-only fs -> Chrome and other apps feel slow).
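One way to tell which comes first would be to record the moment the root filesystem flips to read-only and compare that with when the slowdown starts. A minimal sketch, assuming the Linux `/proc/mounts` layout; `mount_is_readonly` is a hypothetical helper name and the log path in the usage comment is a placeholder, not something that exists on this machine:

```shell
#!/bin/sh
# mount_is_readonly: hypothetical helper.
#   $1    = mount point to check (e.g. /)
#   stdin = text in /proc/mounts format
# Exits 0 if the mount point currently carries the "ro" option, 1 otherwise.
mount_is_readonly() {
    awk -v mp="$1" '
        $2 == mp {
            found = 0                        # the last entry for a mount point wins
            n = split($4, opts, ",")         # field 4 is the comma-separated option list
            for (i = 1; i <= n; i++)
                if (opts[i] == "ro") found = 1
        }
        END { exit(found ? 0 : 1) }'
}

# Intended use, e.g. once a minute from cron. The log must live on a
# filesystem that is still writable (placeholder path on another disk):
#   mount_is_readonly / < /proc/mounts && date >> /mnt/other-disk/ro-events.log
```

The options are split and compared exactly so that `errors=remount-ro`, which is present even on a healthy read-write mount, is not mistaken for `ro`.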






























  • It means the filesystem was left in an inconsistent state when you powered off, for example some blocks were/could not be written. Look in your syslog for error messages relating to the disk. What is on /dev/sdb2? If you have smartctl, get the SMART values.

    – dirkt
    Oct 10 '18 at 6:18
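
The superblock may also hold a record of what went wrong. ext4 keeps the head of the orphan list (inodes that were unlinked or being truncated while still in use) in the superblock, and when the kernel flags an error it can store an error count together with the first and last function that reported one. A minimal sketch for pulling those fields out of `dumpe2fs -h`, assuming e2fsprogs is installed and that the field labels below match your version's output (several of them are only printed when set); `fs_error_summary` is a hypothetical helper name, not part of any tool:

```shell
#!/bin/sh
# fs_error_summary: hypothetical helper that filters `dumpe2fs -h` output
# (read from stdin) down to the header fields describing filesystem health.
# The field labels are an assumption about what your dumpe2fs prints.
fs_error_summary() {
    grep -E '^(Filesystem state|Errors behavior|First orphan inode|FS Error count|(First|Last) error (time|function)):'
}

# Intended use (needs root; it only reads the superblock):
#   sudo dumpe2fs -h /dev/sdb2 2>/dev/null | fs_error_summary
```

If an error function shows up there, its name is a more specific search term than the generic fsck message.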











  • @dirkt I'm sure that in one instance I powered off the machine but I'm also sure that in the other instance a proper shutdown had happened. I've added details about sdb2 (it mounts the root fs "/"), SMART (what should I be looking for in there?), and syslog (I'm unsure if that "re-mounted" is normal or not, looks normal since I see it every day???).

    – Arjuna Del Toso
    Oct 10 '18 at 19:14






  • SMART attributes look good (100 is nominal, lower is worse). Don't look only at the newest syslog, look at older ones (you should have a logrotate). You want the error(s) that happened before it discovered the problem on reboot. If there are no errors (quite possible given the good SMART values), I don't know the cause.

    – dirkt
    Oct 10 '18 at 19:56
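
The "100 is nominal, lower is worse" reading can be sketched as a small filter over the attribute table. This assumes the column order shown in the question (ID, name, flags, VALUE, WORST, THRESH in fields 1 to 6) and, as a simplification, treats 100 as nominal for every attribute, which not every vendor follows; `smart_margins` is a hypothetical helper name:

```shell
#!/bin/sh
# smart_margins: hypothetical helper. Reads smartctl output on stdin and, for
# each attribute row, prints the normalized VALUE, its THRESH and the margin
# between them. Assumes VALUE/WORST/THRESH sit in fields 4-6.
smart_margins() {
    awk '
        # Attribute rows start with a numeric ID and have numeric fields 4-6;
        # other sections of the smartctl output do not match this shape.
        $1 ~ /^[0-9]+$/ && $4 ~ /^[0-9]+$/ && $5 ~ /^[0-9]+$/ && $6 ~ /^[0-9]+$/ {
            value = $4 + 0; thresh = $6 + 0; margin = value - thresh
            if (thresh > 0 && margin <= 0) status = "at-or-below-threshold"
            else if (value < 100)          status = "below-100"
            else                           status = "ok"
            printf "%s %s value=%d thresh=%d margin=%d %s\n", $1, $2, value, thresh, margin, status
        }'
}

# Intended use:
#   sudo smartctl -x /dev/sdb | smart_margins
```

Raw values are not interpreted here because they are vendor-specific, and this drive is not in the smartctl database. On some SSDs attribute 192 counts unsafe shutdowns, but whether that applies to this model is an assumption.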











  • @dirkt thanks, the syslogs I've added to the question are similar up to syslog.4 (a few days before the problem). At this point I guess I'll wait and see if it happens again =)

    – Arjuna Del Toso
    Oct 18 '18 at 0:46
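
Rather than comparing the rotated logs by eye, they could be searched for the kernel messages that usually accompany disk or filesystem trouble. A minimal sketch, assuming the rotated files are either plain text or gzip-compressed; the pattern list is a guess at likely messages rather than an exhaustive one, and `scan_disk_errors` is a hypothetical helper name:

```shell
#!/bin/sh
# scan_disk_errors: hypothetical helper. Takes log files as arguments,
# decompresses the *.gz ones, and prints lines that look like ext4, block
# layer or ATA link errors. The pattern list is an assumption, not a
# complete catalogue of kernel messages.
scan_disk_errors() {
    for f in "$@"; do
        case "$f" in
            *.gz) gzip -dc -- "$f" ;;
            *)    cat -- "$f" ;;
        esac
    done | grep -E 'EXT4-fs (error|warning)|I/O error|ata[0-9]+(\.[0-9]+)?: (exception|failed command|hard resetting link)|Remounting filesystem read-only'
}

# Intended use (reading /var/log/syslog* may need root or the adm group):
#   scan_disk_errors /var/log/syslog*
```

The daily "re-mounted. Opts: errors=remount-ro" line does not match, since it is the normal remount of / during boot. Note that once / has gone read-only, later messages may never reach the log files on that disk.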



















This is still happening. I've taken the following measures, somewhat at random, in the hope of fixing the situation, but to no avail:




  • removed the encrypted user home (now the entire disk is a plain ext4 with no ecryptfs)

  • removed encryption from swap and moved to a swapfile on a different disk

  • updated the kernel to 4.15.0.42.63 amd64 [from: 4.15.0.29.51]


I feel like the problem happens when the system is overloaded, but it could easily be the other way around (errors -> read-only fs -> Chrome and other apps feel slow).
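One way to tell which came first would be to record the moment / flips to read-only, together with the load at that time. Below is a minimal sketch, assuming the usual Linux /proc/mounts layout. The helper name root_mount_state is hypothetical, and where to log the result (ideally a file on the other disk) is left open.

```shell
#!/bin/sh
# Print "ro" or "rw" for a mount point, reading a mounts table in
# /proc/mounts format: device mountpoint fstype options dump pass.
# root_mount_state is a hypothetical helper name, not a system tool.
root_mount_state() {
    mounts_file=${1:-/proc/mounts}
    mount_point=${2:-/}
    awk -v mp="$mount_point" '
        $2 == mp { opts = $4 }          # keep the last matching entry
        END {
            if (opts == "") exit 1      # mount point not found
            n = split(opts, o, ",")
            # Compare whole options, so errors=remount-ro is not mistaken for ro.
            for (i = 1; i <= n; i++) if (o[i] == "ro") { print "ro"; exit 0 }
            print "rw"
        }' "$mounts_file"
}
```

Called periodically (from cron, for instance), it could append a timestamp and the contents of /proc/loadavg to a log whenever it prints "ro", which would show whether the load spike precedes or follows the remount.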










  • It means the filesystem was left in an inconsistent state when you powered off, for example some blocks were/could not be written. Look in your syslog for error messages relating to the disk. What is on /dev/sdb2? If you have smartctl, get the SMART values.

    – dirkt
    Oct 10 '18 at 6:18











  • @dirkt I'm sure that in one instance I powered off the machine, but I'm also sure that in the other instance a proper shutdown had happened. I've added details about sdb2 (it is mounted as the root fs "/"), SMART (what should I be looking for in there?), and syslog (I'm unsure whether that "re-mounted" entry is normal; it looks normal, since I see it every day).

    – Arjuna Del Toso
    Oct 10 '18 at 19:14






  • SMART attributes look good (100 is nominal, lower is worse). Don't look only at the newest syslog, look at older ones (you should have a logrotate). You want the error(s) that happened before it discovered the problem on reboot. If there are no errors (quite possible given the good SMART values), I don't know the cause.

    – dirkt
    Oct 10 '18 at 19:56











  • @dirkt thanks, the syslogs I've added to the question are similar up to syslog.4 (a few days before the problem). At this point I guess I'll wait and see if it happens again =)

    – Arjuna Del Toso
    Oct 18 '18 at 0:46
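The suggestion in the comments to look through the older, rotated syslogs for errors logged before the reboot can be sketched as follows. This is a rough starting point, not a definitive tool: scan_storage_errors is a hypothetical helper name, and the pattern list covers only a few kernel messages that commonly accompany disk or ext4 trouble, so an empty result does not prove that nothing went wrong.

```shell
#!/bin/sh
# Print log lines that typically accompany storage trouble.
# scan_storage_errors is a hypothetical helper; the pattern list is an
# assumption about which kernel messages are worth a first look.
scan_storage_errors() {
    for f in "$@"; do
        [ -r "$f" ] || continue
        case "$f" in
            *.gz) gzip -cd -- "$f" ;;   # rotated, compressed logs
            *)    cat -- "$f" ;;        # current and uncompressed logs
        esac
    done | grep -E 'EXT4-fs (error|warning)|I/O error|ata[0-9]+(\.[0-9]+)?: (exception|failed command|hard resetting link)|Remounting filesystem read-only'
}

# Typical invocation on an Ubuntu-like system (may need sudo to read the logs):
# scan_storage_errors /var/log/syslog /var/log/syslog.1 /var/log/syslog.*.gz
```

Normal boot messages such as "EXT4-fs (sdb2): mounted filesystem with ordered data mode" and "re-mounted. Opts: errors=remount-ro" are deliberately not matched, since they appear on every healthy boot.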






























Oct 9 10:21:38 centurion kernel: [ 3.634734] psmouse serio1: elantech: assuming hardware version 4 (with firmware version 0x361f00)
Oct 9 10:21:38 centurion kernel: [ 3.674405] psmouse serio1: elantech: Synaptics capabilities query result 0x00, 0x16, 0x0d.
Oct 9 10:21:38 centurion kernel: [ 3.740007] raid6: sse2x1 gen() 10059 MB/s
Oct 9 10:21:38 centurion kernel: [ 3.788005] raid6: sse2x1 xor() 6131 MB/s
Oct 9 10:21:38 centurion kernel: [ 3.808299] [drm] RC6 on
Oct 9 10:21:38 centurion kernel: [ 3.836004] raid6: sse2x2 gen() 12046 MB/s
Oct 9 10:21:38 centurion kernel: [ 3.884002] raid6: sse2x2 xor() 8275 MB/s
Oct 9 10:21:38 centurion kernel: [ 3.932005] raid6: sse2x4 gen() 13873 MB/s
Oct 9 10:21:38 centurion kernel: [ 3.980004] raid6: sse2x4 xor() 9533 MB/s
Oct 9 10:21:38 centurion kernel: [ 4.028005] raid6: avx2x1 gen() 23736 MB/s
Oct 9 10:21:38 centurion kernel: [ 4.076004] raid6: avx2x1 xor() 17173 MB/s
Oct 9 10:21:38 centurion kernel: [ 4.124002] raid6: avx2x2 gen() 27103 MB/s
Oct 9 10:21:38 centurion kernel: [ 4.172003] raid6: avx2x2 xor() 18831 MB/s
Oct 9 10:21:38 centurion kernel: [ 4.220003] raid6: avx2x4 gen() 30098 MB/s
Oct 9 10:21:38 centurion kernel: [ 4.268004] raid6: avx2x4 xor() 22359 MB/s
Oct 9 10:21:38 centurion kernel: [ 4.268701] raid6: using algorithm avx2x4 gen() 30098 MB/s
Oct 9 10:21:38 centurion kernel: [ 4.269390] raid6: .... xor() 22359 MB/s, rmw enabled
Oct 9 10:21:38 centurion kernel: [ 4.270077] raid6: using avx2x2 recovery algorithm
Oct 9 10:21:38 centurion kernel: [ 4.270769] psmouse serio1: elantech: Elan sample query result 00, 49, 75
Oct 9 10:21:38 centurion kernel: [ 4.273643] xor: automatically using best checksumming function avx
Oct 9 10:21:38 centurion kernel: [ 4.284699] Btrfs loaded, crc32c=crc32c-intel
Oct 9 10:21:38 centurion kernel: [ 4.506433] input: ETPS/2 Elantech Touchpad as /devices/platform/i8042/serio1/input/input6
Oct 9 10:21:38 centurion kernel: [ 9.433983] EXT4-fs (sdb2): mounted filesystem with ordered data mode. Opts: (null)
Oct 9 10:21:38 centurion kernel: [ 10.700673] Lockdown: /dev/mem,kmem,port is restricted; see man kernel_lockdown.7
Oct 9 10:21:38 centurion kernel: [ 12.663600] lp: driver loaded but no devices found
Oct 9 10:21:38 centurion kernel: [ 12.790174] ppdev: user-space parallel port driver
Oct 9 10:21:38 centurion kernel: [ 15.800260] EXT4-fs (sdb2): re-mounted. Opts: errors=remount-ro




UPDATE



This is still happening. I've tried the following measures, more or less at random, in the hope of fixing it, but to no avail:




  • removed the encrypted user home (the entire disk is now plain ext4 with no ecryptfs)

  • removed encryption from swap and moved swap to a swap file on a different disk

  • updated the kernel from 4.15.0.29.51 to 4.15.0.42.63 (amd64)


I feel like the problem happens when the system is overloaded, but it could easily be the other way around (errors -> read-only fs -> Chrome and other apps feel slow).
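One way to tell the two orderings apart is to check, at the moment things start to feel slow, whether / has already gone read-only. The following is only a minimal sketch, assuming a Linux /proc/mounts in the usual six-column format; the function name mount_is_readonly and its exit codes are made up for illustration.

```shell
# Hypothetical helper: exit 0 if the mount point is mounted read-only,
# 1 if it is mounted read-write, 2 if it does not appear in the table.
# $1 = mount point, $2 = mounts table (assumed default: /proc/mounts)
mount_is_readonly() {
    awk -v mp="$1" '
        $2 == mp { opts = $4 }              # the last matching entry wins
        END {
            if (opts == "") exit 2          # mount point not listed
            n = split(opts, o, ",")
            for (i = 1; i <= n; i++)
                if (o[i] == "ro") exit 0    # exact match, so errors=remount-ro does not count
            exit 1
        }
    ' "${2:-/proc/mounts}"
}
```

Running `mount_is_readonly / && echo "root is read-only"` while the machine is misbehaving would show whether the slowdown coincides with the filesystem having been remounted read-only.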







linux ubuntu hard-drive ext4 inode














edited Dec 18 '18 at 17:12







Arjuna Del Toso

















asked Oct 10 '18 at 3:32









Arjuna Del Toso








  • 1





    It means the filesystem was left in an inconsistent state when you powered off; for example, some blocks were not, or could not be, written. Look in your syslog for error messages relating to the disk. What is on /dev/sdb2? If you have smartctl, get the SMART values.

    – dirkt
    Oct 10 '18 at 6:18
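For readers following along, the attribute table printed by `smartctl -A` can be skimmed mechanically. This is a rough sketch, assuming the standard ten-column layout (ID#, ATTRIBUTE_NAME, FLAG, VALUE, WORST, THRESH, TYPE, UPDATED, WHEN_FAILED, RAW_VALUE); the helper name flag_smart_attrs and its output wording are hypothetical.

```shell
# Hypothetical helper: read `smartctl -A` output on stdin and print only the
# attributes that look suspicious. Assumes the standard ten-column table.
flag_smart_attrs() {
    awk '
        $1 ~ /^[0-9]+$/ && NF >= 10 {
            value  = $4 + 0                 # normalized value, higher is better
            thresh = $6 + 0                 # vendor failure threshold, 0 means none
            if (thresh > 0 && value <= thresh)
                printf "FAILING %s value=%d thresh=%d\n", $2, value, thresh
            else if ($9 != "-")             # WHEN_FAILED records an earlier failure
                printf "FAILED_BEFORE %s when=%s\n", $2, $9
        }
    '
}
```

It could be used as `sudo smartctl -A /dev/sdb | flag_smart_attrs`; no output means no attribute is at or below its threshold, which says nothing about cabling, power loss or firmware problems.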











  • @dirkt I'm sure that in one instance I powered off the machine, but I'm also sure that in the other instance a proper shutdown had happened. I've added details about sdb2 (it holds the root fs "/"), SMART (what should I be looking for in there?), and syslog (I'm unsure whether that "re-mounted" is normal; it looks normal, since I see it every day).

    – Arjuna Del Toso
    Oct 10 '18 at 19:14






  • 1





    SMART attributes look good (100 is nominal, lower is worse). Don't look only at the newest syslog; look at the older ones too (logrotate should have kept them). You want the error(s) that happened before the problem was discovered on reboot. If there are no errors (quite possible given the good SMART values), I don't know the cause.

    – dirkt
    Oct 10 '18 at 19:56
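The search through rotated logs suggested here can be sketched roughly as follows. It assumes the Ubuntu-style layout (syslog, syslog.1, syslog.N.gz under /var/log); the function name scan_disk_errors is hypothetical, and the pattern list is an assumption covering a few common ext4 and libata kernel messages, not an exhaustive set.

```shell
# Hypothetical helper: print disk-related error lines from the current and
# rotated syslogs, each prefixed with the name of the file it came from.
# $1 = log directory (assumed default: /var/log)
scan_disk_errors() {
    log_dir="${1:-/var/log}"
    # Assumed, non-exhaustive patterns for ext4 and libata trouble.
    pattern='EXT4-fs (error|warning)|Remounting filesystem read-only|I/O error|ata[0-9]+(\.[0-9]+)?: (exception|failed command|hard resetting link)'
    for f in "$log_dir"/syslog*; do
        [ -e "$f" ] || continue             # the glob matched nothing
        name=$(basename "$f")
        case "$f" in
            *.gz) gzip -dc -- "$f" ;;       # rotated logs are usually compressed
            *)    cat -- "$f" ;;
        esac | grep -E "$pattern" | awk -v f="$name" '{ print f ": " $0 }'
    done
}
```

Calling `scan_disk_errors /var/log` (possibly with sudo, depending on log permissions) lists candidate lines; anything timestamped shortly before the boot that reported the orphan-list inconsistency is the interesting part.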











  • @dirkt Thanks. The older syslogs look similar to the one I added to the question, back to syslog.4 (a few days before the problem). At this point I guess I'll wait and see if it happens again =)

    – Arjuna Del Toso
    Oct 18 '18 at 0:46























