The only complicated bit about the PSX CPU is the "tekken2 bug", which requires you to emulate the instruction pipeline.
Not only do you need to delay branches (i.e. always execute the next instruction after a branch), you also have to delay loads that occur in that delay-slot instruction. It only matters if the instruction at the branch destination uses the loaded register. pcsx does a complicated analysis and generates code depending on that. For the MAME interpreter I just incorporated it into the branch delay code. I don't think any of the other open source emulators ever had it working properly.
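
To make the interaction concrete, here is a rough sketch of an interpreter step that handles both delays. It is not the MAME or pcsx code: the toy opcodes, `struct cpu`, `step` and `run_demo` are all made up for illustration, and corner cases such as back-to-back loads to the same register are not modelled. The idea is that a load only parks its value, and the value is committed after the following instruction has read its operands, wherever that instruction happens to be.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy decoded instructions, NOT real MIPS encodings (assumption for brevity). */
enum op { OP_NOP, OP_LOAD, OP_ADD, OP_JUMP };

struct insn {
    enum op  op;
    int      rd, rs, rt;  /* register numbers; register 0 is hardwired to zero */
    uint32_t imm;         /* word index for OP_LOAD, target index for OP_JUMP */
};

struct cpu {
    uint32_t regs[32];
    uint32_t pc;        /* instruction executed on this step */
    uint32_t next_pc;   /* instruction executed on the following step */
    int      load_reg;  /* register with a load in flight, 0 means none */
    uint32_t load_val;  /* value waiting to be written to load_reg */
};

void step(struct cpu *c, const struct insn *prog, size_t n, const uint32_t *mem)
{
    assert(c->pc < n);
    const struct insn *i = &prog[c->pc];

    /* Take over the load issued by the previous instruction. It is committed
       only after this instruction has read its source registers. */
    int      pend_reg = c->load_reg;
    uint32_t pend_val = c->load_val;
    c->load_reg = 0;

    /* Branch delay: the instruction after a jump always runs, because a jump
       only changes next_pc, never pc. */
    c->pc = c->next_pc;
    c->next_pc = c->pc + 1;

    int      dest = 0;
    uint32_t result = 0;
    switch (i->op) {
    case OP_LOAD:  /* park the value instead of writing the register now */
        c->load_reg = i->rd;
        c->load_val = mem[i->imm];
        break;
    case OP_ADD:   /* operands are read before the pending load lands */
        dest = i->rd;
        result = c->regs[i->rs] + c->regs[i->rt];
        break;
    case OP_JUMP:
        c->next_pc = i->imm;
        break;
    case OP_NOP:
        break;
    }

    /* Commit the delayed load first, so a write by this instruction wins. */
    if (pend_reg != 0) c->regs[pend_reg] = pend_val;
    if (dest != 0)     c->regs[dest] = result;
}

/* Hypothetical demo: a load sitting in a branch delay slot. */
void run_demo(uint32_t *r2, uint32_t *r3)
{
    static const uint32_t mem[1] = { 42 };
    static const struct insn prog[] = {
        { OP_JUMP, 0, 0, 0, 3 },  /* 0: jump to index 3                  */
        { OP_LOAD, 1, 0, 0, 0 },  /* 1: delay slot, r1 = mem[0]          */
        { OP_NOP,  0, 0, 0, 0 },  /* 2: skipped by the jump              */
        { OP_ADD,  2, 1, 0, 0 },  /* 3: branch destination, r2 = r1 + r0 */
        { OP_ADD,  3, 1, 0, 0 },  /* 4: r3 = r1 + r0                     */
    };
    struct cpu c = { 0 };
    c.regs[1] = 5;   /* old value of r1 */
    c.next_pc = 1;

    for (int k = 0; k < 4; k++)
        step(&c, prog, sizeof prog / sizeof prog[0], mem);

    *r2 = c.regs[2];
    *r3 = c.regs[3];
}
```

After `run_demo`, `r2` is 5 because the instruction at the branch destination still sees the old `r1`, while `r3` is 42 because the load has landed by then. An interpreter that writes the register as soon as the delay-slot instruction executes would give 42 in both.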