It could still be IO-bound. Disk throughput is just one factor; there is also the time the disks take to serve each request. On Linux, use iostat -x to see the disk queue size, wait, and service times, and check whether they increase significantly while your program is running.
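If iostat is not at hand, a rough equivalent of its wait-time column can be derived from two snapshots of /proc/diskstats. The following is only a sketch: the function names are made up for illustration, and it assumes the standard Linux field layout (device name in the third column, completed reads and writes in the 4th and 8th, milliseconds spent reading and writing in the 7th and 11th).

```python
def parse_diskstats(text):
    """Return {device: (completed_ios, ms_spent_on_ios)} from /proc/diskstats text."""
    stats = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 14:
            continue  # skip blank or unexpectedly short lines
        name = fields[2]
        reads, ms_reading = int(fields[3]), int(fields[6])
        writes, ms_writing = int(fields[7]), int(fields[10])
        stats[name] = (reads + writes, ms_reading + ms_writing)
    return stats


def average_wait_ms(before, after, device):
    """Average milliseconds per completed I/O between two snapshots.

    Returns None when no I/O completed on the device in the interval.
    """
    ios = after[device][0] - before[device][0]
    ms = after[device][1] - before[device][1]
    return ms / ios if ios > 0 else None
```

To use it, read /proc/diskstats once, wait a few seconds while your program runs, read it again, and pass both parsed snapshots with the device name (for example "sda", which is just a placeholder here). A value that climbs well above the idle baseline suggests requests are queuing at the disk.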
Wait times can even be a factor on SSDs if there are many small reads and writes. In that case the SSD may lag behind because internally it has a minimum block size that it must read or write. For example, if you are reading or writing 8k blocks but the SSD has a 128k internal block size, the effective top throughput can be as low as 1/16th of the SSD's top throughput.
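The 1/16th figure is just the ratio of request size to internal block size. A minimal sketch of that arithmetic, assuming the worst case where every small request costs the drive one full internal block (the function name and the 500 MB/s peak are hypothetical, chosen only for illustration):

```python
def effective_throughput(peak_mb_s, request_kib, internal_block_kib):
    """Estimate useful throughput when each request costs a full internal block.

    Worst-case model: a request smaller than the internal block still makes
    the drive move one whole block, so only request/block of the work is useful.
    """
    if request_kib >= internal_block_kib:
        return peak_mb_s  # requests already fill whole blocks
    return peak_mb_s * request_kib / internal_block_kib


# Hypothetical drive with a 500 MB/s peak, 8k requests, 128k internal blocks.
print(effective_throughput(500, 8, 128))  # 31.25, i.e. 1/16th of the peak
```

Real drives cache and coalesce requests, so the actual penalty is often smaller than this estimate.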