I assumed that making the TCP timeout long (15-20 minutes) so the job can finish is bad from a security point of view, since someone could use up all the connections in a DoS-type attack and those connections would never time out. First, is that right? Second, is a keepalive pretty much the same?
It depends on the application. Long TCP timeouts do create entries in the connection table that sit for a long time. And yes - this can be abused to DoS a service. However, the LTM has an idle reaper - if connection table memory usage goes over 80%, the reaper kicks in and starts deleting the connections that have been idle the longest.
Keepalives keep resetting the idle timer, so the reaper has a harder job finding the oldest idle connection and is more likely to delete an active connection.
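To make that concrete, here is a toy Python model of a longest-idle reaper. It is only a sketch and not how the LTM is actually implemented: it counts connections instead of memory, the 80% threshold is simply the figure quoted above, and the names Conn, reap and build_table are made up for the illustration.

```python
from dataclasses import dataclass


@dataclass
class Conn:
    name: str
    last_activity: float  # time (seconds) of the last packet seen, keepalives included


def reap(table, now, capacity, high_water=0.8):
    """Evict the longest-idle connections until the table is back at the threshold.

    Assumption: capacity is a connection count, standing in for table memory.
    Returns (survivors, evicted).
    """
    limit = int(capacity * high_water)
    if len(table) <= limit:
        return list(table), []
    # Longest idle first: idle time is now minus the last activity seen.
    by_idle = sorted(table, key=lambda c: now - c.last_activity, reverse=True)
    excess = len(table) - limit
    return by_idle[excess:], by_idle[:excess]


def build_table(keepalive_on_stale):
    # Four connections with recent real traffic (last data at t=560..575).
    table = [Conn(f"active{i}", last_activity=560 + 5 * i) for i in range(4)]
    # One connection whose last real data was at t=0. If it sends keepalives,
    # its idle timer was reset at t=595 even though it carries no real traffic.
    table.append(Conn("stale", last_activity=595 if keepalive_on_stale else 0))
    return table


now = 600
_, evicted_plain = reap(build_table(False), now, capacity=5)
_, evicted_keepalive = reap(build_table(True), now, capacity=5)

print([c.name for c in evicted_plain])      # the stale connection is reaped
print([c.name for c in evicted_keepalive])  # a connection with real traffic is reaped instead
```

Without keepalives the stale connection is clearly the oldest and gets removed. With keepalives it looks fresher than every connection doing real work, so the reaper picks one of those instead.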
I guess third, is there a way to write an iRule that sends, say, 3 keepalives and then lets the connection die, or is there some other, better way to handle it?
There is no iRule command to send a TCP keepalive. I'd recommend removing the keepalive (as it interferes with the application) and shifting to a longer TCP idle timeout. I suspect that will address the application issues. As long as the TCP profile is limited to the specific application, you will probably not have any issues.