When we talk about processing data in real time, it is easy to just write a program and be done with it.
The problems start piling up when we add analytics and volume.
A program is easy to write when it can process records sequentially. Once you reach the limit of sequential processing, you start adding complexity that may end up representing the bulk of your work: you start with multi-threading, and eventually you also need multi-processing, with processes distributed across multiple machines. It is much easier to let a framework handle those issues.
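To make that first step concrete, here is a minimal Python sketch of the same per-record work done sequentially and then with a thread pool. The `enrich` function and the record fields (`price`, `quantity`) are hypothetical stand-ins for whatever your program does to each record; they are not taken from any particular framework.

```python
from concurrent.futures import ThreadPoolExecutor


def enrich(record: dict) -> dict:
    """Hypothetical per-record step; stands in for real work such as a lookup."""
    return {**record, "total": record["price"] * record["quantity"]}


def process_sequentially(records: list[dict]) -> list[dict]:
    # The easy version: one record after another, nothing to coordinate.
    return [enrich(record) for record in records]


def process_with_threads(records: list[dict], workers: int = 4) -> list[dict]:
    # The first layer of added complexity: the pool, its size, and its
    # shutdown are now your responsibility. map() keeps the input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(enrich, records))


# Illustrative sample data only.
records = [{"id": i, "price": 2.5, "quantity": i} for i in range(5)]
sequential_results = process_sequentially(records)
threaded_results = process_with_threads(records)
```

Both functions return the same results here. The threaded version already has more to get wrong, and it still runs on a single machine; spreading the work across several machines adds serialization, scheduling, and failure handling on top.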
Still, a framework may give you the ability to distribute your processing, but how easy is it to use? You now want proper tools to assemble the many operations you need to link together. You also need tools to easily identify bottlenecks so you can parallelize your operations. And what about all the standard operations you would expect to be available out of the box?
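As a rough illustration of what "assembling operations" and "identifying bottlenecks" mean in practice, the sketch below chains two stages and accumulates the time spent in each one. Everything in it is an assumption made for the example: `run_pipeline`, the `parse` and `score` stages, and the artificial `sleep` that makes one stage slow. It shows the kind of plumbing you would otherwise write by hand, not how any specific product does it.

```python
import time
from typing import Callable

Stage = Callable[[dict], dict]


def run_pipeline(
    records: list[dict], stages: list[tuple[str, Stage]]
) -> tuple[list[dict], dict[str, float]]:
    """Apply each named stage in order and accumulate the time spent per stage."""
    timings = {name: 0.0 for name, _ in stages}
    results = []
    for record in records:
        for name, stage in stages:
            start = time.perf_counter()
            record = stage(record)
            timings[name] += time.perf_counter() - start
        results.append(record)
    return results, timings


def parse(record: dict) -> dict:
    # Hypothetical cheap stage: convert the raw text field to an integer.
    return {**record, "value": int(record["raw"])}


def score(record: dict) -> dict:
    # Hypothetical expensive stage; the sleep simulates slow work.
    time.sleep(0.01)
    return {**record, "score": record["value"] * 2}


stages = [("parse", parse), ("score", score)]
results, timings = run_pipeline([{"raw": "1"}, {"raw": "2"}, {"raw": "3"}], stages)

# The stage with the largest accumulated time is the first candidate
# for parallelization.
bottleneck = max(timings, key=timings.get)
```

Even this toy version needs its own bookkeeping for timing, and it says nothing yet about running the slow stage in parallel.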
This is where a platform comes in. It gives you the foundation for distributed processing, along with pre-built capabilities to interact with the outside world (files, message queues, databases, and so on) and built-in analytics, so you don't have to reinvent the wheel.