sprezzatech blog #0017

Dirty South Supercomputers and Waffles
UNIX, HPC, cyberwarfare, and chasing perf in Atlanta.

So many problems, Chef.
Tue, 02 Apr 2013 03:40:36 -0400

so very many problems.

ughhhgh, there i was happily hacking away last evening, when suddenly everything goes to hell. advancing a line in Konsole yields a popup notification that it couldn't write to /tmp. running anything but trivial commands hits immediate read errors. those which can be run inevitably hit similar errors processing their arguments. it's just as if a network drive dropped out from under you, or you unplugged an external drive without properly unmounting it. well, indeed, intel had unplugged one of my 128GB 320 SSDs for me, courtesy of an apparently well-known bug which manifests itself as a serial number of BAD_CTX [hexval] and only 8MB of zeroes available on the block device. annoyingly, this was my primary device (/, /usr, /home, everything but active torrents, a few VMs, and bulk storage). that'd not have been much of a problem, but (most unfortunately, i realized as i sat up with a bolt) last week i'd pulled apart the mdadm software RAID1 which typically backed these data, in order to stash the aforementioned VMs. indeed, the dead SSD had three partitions on it, two of them degraded halves of my raid. yep, boom, with that i'd lost everything aside from media on my machine. at least a day lost to rebuilding it, i knew, if rebuilding was even possible.
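for anyone who wants to check a drive for the same symptom, here's a minimal python sketch. it only encodes the two signs described above (a serial beginning with BAD_CTX, and a block device of roughly 8MB). the function names and the exact 8MiB threshold are my own assumptions, not anything intel documents, and you'd have to supply the serial string yourself (e.g. copied from `hdparm -I` or `smartctl -i` output).

```python
from pathlib import Path

SECTOR_BYTES = 512               # /sys/block/<dev>/size counts 512-byte units
EIGHT_MIB = 8 * 1024 * 1024      # assumed threshold for "only 8MB available"


def looks_like_bad_ctx(serial: str, size_bytes: int) -> bool:
    """True if both symptoms from the post are present (hypothetical helper)."""
    bad_serial = serial.strip().upper().startswith("BAD_CTX")
    tiny_device = size_bytes <= EIGHT_MIB
    return bad_serial and tiny_device


def sysfs_size_bytes(dev: str, sysfs_root: str = "/sys/block") -> int:
    """Read the capacity of a block device (e.g. 'sda') from sysfs, in bytes."""
    sectors = int(Path(sysfs_root, dev, "size").read_text())
    return sectors * SECTOR_BYTES
```

usage would be something like `looks_like_bad_ctx(serial, sysfs_size_bytes("sda"))`, with `serial` pasted in from whatever tool you trust to report it.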

well, i managed to rescue...most everything, i think, though i'm only finishing up now (around 0400 tuesday) after starting at 0700 monday morning and working continuously. utterly fucking wasted day. a horrible time. i'm shocked and appalled by intel's shoddy work here, especially given the reliability premium i'd happily paid them. admittedly, the problem appears to be in the firmware as opposed to the NAND flash, and it's not as if intel makes the firmware/controller, but still, augh, the product ought not be on the market. this failure case is hard and fast, from all i can tell. i've got some new data, by the way, for anyone dealing with this issue (from the fora searches i've performed, there seem to be a great many of you): so i'm fairly certain my data is still there (likely not in any recognizable form, due to what are presumably profoundly fucked internal mapping tables, but certainly able to be dd'd off the drive for tedious dredging) if i could just zorch the fucking HPA. when i refer to this, btw, i don't mean anything to do with T13 drive security settings. so i'll keep this thing around for a bit and see if i can't pull my most recent data from it at some point. in the meantime, the next intel employee i see is losing exactly one tooth.

whilst reinstalling my box, i took a few quick notes on problems i noticed in sprezzos. allow me to reproduce the list, in the order i noticed them. totally not daunting at all:

  1. ssmtp needs a rebuild (gnutls)
  2. pack net-ip-perl, io-socket-inet6-, digest-hmac-, authen-ntlm-, authen-sasl-
  3. pack guile-1.8 or move users
  4. rebuild nullmailer (gnutls)
  5. still not running gpm -- not even installing, actually
  6. when gpm is installed, warning about /usr/lib/gpm/gpm_has_mouse_control
  7. once installed, "systemctl start gpm" does indeed work
  8. ...though it still doesn't start at boot
  9. still not bringing up interfaces on boot
  10. writing allow-hotplug entries to /etc/network/interfaces twice
  11. need fix usb keyboards yesterday
  12. need run setupcon on boot
  13. pack surfraw
  14. pack iselect
  15. pack screenie
  16. pack libcache-perl, libclass-*-perl, libdata-*-perl
  17. pack libfeed-find-perl, libunicode-map8-perl, libaudio-scrobbler-perl
  18. i think we've lost our aptitude defaults
  19. should rsyslog really be running with systemd?
  20. colorize prompt by default
  21. really must move to physical naming for disks
  22. turn on syncookies by default -- lost sysctl settings?
  23. nvidia doesn't build out of the box against our shipped kernel
  24. need run sensors-detect and add discovered modules to /etc/modules
  25. ssh-agent is running through pam but need generate ssh key for added users
  26. pm-utils recommend obsolete cpufrequtils
  27. clutter-2.0-gst conflicts with clutter-1.0-gst
  28. compiz9 doesn't recommend/depend on plugin, backend, or settings manager
  29. udisks2 recommends obsolete cryptsetup-bin
  30. gconf2 APT hooks bitch about dbus when run from console
  31. default xdg directories are terrible (Templates? Videos? fuck you)
  32. need rebuild qdbus with epoch >= 4 or old one gets reinstalled over it
  33. konsole throws up a 'knotify crashed' dialog for invalid tab completes(!)
  34. top needs be colorized by default
  35. need solarized vim theme by default
  36. smartmontools ought be installed by default
  37. smart doesn't run automatically once installed
  38. smart needs a semi-sensible config
  39. probably want a more useful default xinitrc than "xterm"
  40. why is notify-osd installed to /usr/lib/notify-osd where it's useful to no one?
  41. compiz8 also doesn't recommend backend (does get plugins)
  42. compiz-kde isn't installable
  43. nouveau 9.1.1 doesn't appear to work -- warning about can't open nouveau dri
  44. looks like we're missing a in /usr/lib/dri
  45. what the hell is agetty? use mingetty by default
  46. mpd ought start lastmp and lastfmsubmitd for me
  47. at the very least, lastmp ought start lastfmsubmitd or vice versa
  48. why the hell are we installing 3.7.2 spl/zfs
  49. need upgrade spl/zfs to 0.6.0
  50. wtf on login: "-bash: data/zeitgeist-daemon.bash_completion: No such file or directory"
  51. set up wireless interfaces as wpa-roam/manual with an example .conf
  52. our python-libxml2 doesn't get discovered by autotools
  53. raptorial needs depend on apt-file unless it wants to download contents files
  54. investigate this "too big file to journald" crap (maybe systemd-coredump?)
  55. nautilus appears to be missing most of its icons
  56. gnome/gtk are both horrifically ugly out of the box (fonts are not so bad!)
  57. gnome-session fails (use --debug to get more info)
  58. why does udisksd burst as much cpu as mdraid_resync+mdraid_raid6 (no gui used!)
  59. growlight's corners in non-fbterm console are abominably ugly
  60. holy fucking shit cert is only good for we use no www augh
  61. editing profile preferences crashes gnome-terminal
  62. when you go back an entry in growlight, if the previous entry was not in the
      default set of entries listed, you get that set and nothing highlighted
  63. growlight crashes on exit sometimes
  64. growlight crashes if 'h' is pressed while blocked on a slow disk during init
  65. ccsm doesn't have any icons or text (patch at works)
  66. lightdm doesn't start (/usr/share/xgreeters/default.desktop missing)

fuck me gently with a chainsaw!

the one positive thing is that the two 256GB SSDs i ordered to eliminate Intel 320 SSDs from my life forevermore are by....Plextor! nostalgia for my 17-year-old hella warez-burning self overwhelms me :D.

