The methods we have for aligning AIs are poor, and rely on the AI's being less cognitively-capable than people in certain critical skills, so the AIs you refer to as "aligned" won't keep up as the unaligned AIs start to exceed human capability in these critical skills (such as the skill of devising plans that can withstand determined opposition).
You can reply that AI researchers are smart and want to survive, so they are likely to invent alignment techniques that are better than the (deplorably inadequate) techniques that have been discussed and published so far, and I will reply that counting on their inventing these techniques in time is an unacceptable risk when the survival of humanity is at stake -- particularly as the outfit (namely the Machine Intelligence Research Institute) with the most years of experience in looking for an actually-adequate alignment technique has given up and declared that humanity's only chance is if frontier AI research is shut down because at the rate that AI capabilities are progressing, it is very unlikely that anyone is going to devise an adequate alignment technique in time.
It is fucked-up that frontier AI research has not been banned already.
Given we can use AIs to align AIs, I don't see why the methods we have rely on us having more cognitive capabilities than AIs in certain critical areas. In whatever areas we fall short relative to AIs, we can use AIs to assist us so we don't fall short.
We don't know if a supreme deceiver is aligned at all. If a model can think ahead a trillion moves of deception how do humans possibly stand a chance of scrutinizing anything with any confidence?
You can reply that AI researchers are smart and want to survive, so they are likely to invent alignment techniques that are better than the (deplorably inadequate) techniques that have been discussed and published so far, and I will reply that counting on their inventing these techniques in time is an unacceptable risk when the survival of humanity is at stake -- particularly as the outfit (namely the Machine Intelligence Research Institute) with the most years of experience in looking for an actually-adequate alignment technique has given up and declared that humanity's only chance is if frontier AI research is shut down because at the rate that AI capabilities are progressing, it is very unlikely that anyone is going to devise an adequate alignment technique in time.
It is fucked-up that frontier AI research has not been banned already.