Degraded availability for most services (Updated 2020-02-18)
Visit www.ludd.ltu.se for our website
kerberos/ldap auth | running. OK | |
backup systems | backing up. OK | disk on backup server somewhat full |
dns servers | naming things. OK | |
git.ludd.ltu.se | degraded | |
ircshell.ludd.ltu.se | chatting. OK | |
mail.ludd.ltu.se | degraded | |
core network | pushing packets. OK | |
member servers | running. OK | thinlinc up, ssh up |
Userdata fileservers | degraded. Crashing | |
userwww.ludd.ltu.se | degraded | |
vortex.ludd.ltu.se | membership system. degraded | |
status | Investigating |
scope | Mail, userdata, web services |
Description:
An issue with our storage servers is causing them to crash and hang, requiring physical intervention.
Impact:
The servers serve maildirs, userdata, userwww, as well as a few legacy virtual machines handling mail connections, userwww and more.
Update:
* 20200207 00:00 UTC+1
Userdata server starts crashing once a day.
---
* 20171208 18:00 UTC+1
Same symptom on the gfs server serving VM data. Manual intervention is required every time due to a broken out-of-band connection.
---
* 20171217 12:00 UTC+1
Crashing after server hardware changes, problem identified on 12-13 storage servers. Believed to be thermal issues OR kernel incompatibility.
---
status | Scheduled |
scope | LCNet |
end_date | 2017-12-17 22:00:00 +0100 |
Description:
All racked servers will be moved and recabled between Thursday and Sunday.
The reason for this is to prepare the data center for new hardware.
Shelved LCNet servers should not be affected, but might be.
Impact:
Downtime for all LCNet servers.
Risk of downtime on non-racked LCNet servers.
Progress:
status | Done |
scope | LCNet/non-racked servers |
Description:
All non-racked servers will be moved to a new shelf, and power and network will be recabled.
The reason for this is to prepare the data center for new hardware.
Impact:
Downtime for all non-racked LCNet servers.
Progress:
20171208 19:00
Move started.
20171211 20:00
Done! All non-racked servers relocated. Rack servers without rackmounts moved to the bottom of the LCNet rack.
status | Fixed |
scope | New membership |
Description:
The certificate for weblogon.ltu.se has expired.
Impact:
This causes issues with verifying new members.
Update:
* 20171202 00:59 UTC+1
weblogon.ltu.se certificate expired.
---
* 20171202 18:00 UTC+1
Certificate was renewed.
---