Thanks everyone for your input. After some investigating / testing, I have found the culprit. First, the maximum potential throughput of a PCIe 2.0 lane is 500 MB/s after encoding overhead. Here's a good article that details PCIe caps: http://www.tested.com/tech/457440-theoretical-vs-actual-bandwidth-pci-express-and-thunderbolt/
So IF the maximum potential of four PCIe 2.0 lanes could be reached, you would expect a limit approaching 2000 MB/s. However, there is one factor in this equation that I did not originally consider: the software. In this case, the software is the NVMe driver, which is responsible for moving data to / from the SM951, and that work costs CPU time. That overhead is reducing the overall throughput. To verify this, I overclocked / locked my CPU at 4.2 GHz, and my sequential read climbed to about 1430 MB/s. I then underclocked / locked my CPU to 0.8 GHz, and my sequential read dropped to 1150 MB/s.
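For anyone who wants to check the arithmetic, here is a rough Python sketch. It assumes the standard PCIe 2.0 figures (5 GT/s per lane, 8b/10b encoding) and only accounts for encoding overhead, not packet or protocol overhead, so the real ceiling is somewhat lower. The function name `pcie2_bandwidth_mb_s` is just something I made up for illustration; the measured numbers are the two results from my CPU clock tests.

```python
def pcie2_bandwidth_mb_s(lanes):
    """Theoretical PCIe 2.0 payload bandwidth in MB/s (encoding overhead only)."""
    raw_bits_per_s = 5e9                       # 5 GT/s per lane (PCIe 2.0)
    payload_bits_per_s = raw_bits_per_s * 8 / 10   # 8b/10b: 8 data bits per 10 line bits
    bytes_per_s = payload_bits_per_s / 8       # bits -> bytes
    return lanes * bytes_per_s / 1e6           # bytes/s -> MB/s


ceiling = pcie2_bandwidth_mb_s(4)  # x4 link, as used by the SM951 here

# Sequential read results from the CPU clock tests above
measured = {"4.2 GHz": 1430, "0.8 GHz": 1150}

print(f"x4 ceiling: {ceiling:.0f} MB/s")
for clock, mb_s in measured.items():
    print(f"CPU at {clock}: {mb_s} MB/s = {mb_s / ceiling:.1%} of ceiling")
```

The gap between the two percentages is the part that tracks CPU speed, which is what points at driver overhead rather than the link itself.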
Oh, well. I can deal with 1350 MB/s for now. That limit will just give me motivation to upgrade in another year.