Do you mean integrate is writing a 1 GB file that will be consumed through gpfdist into Greenplum?
Or are you referring to a transaction file, even though it is capture that writes this file and not integrate?
Let’s start with the last scenario: HVR writes transaction files containing one or more transactions. Transactions do not span transaction files. So if you see a large transaction file, that generally means a very large transaction was processed on the source system and lots of its changes were relevant to your channel definition.
On the integrate side, when Greenplum is the target, HVR writes coalesced delimited files for Greenplum to consume. The size of these files is determined by the number of row changes per table. By default HVR processes up to 10 MB of (densely compressed) transaction files per cycle, but for a target like Greenplum it generally makes sense to set /CycleByteLimit to either a high value or 0, so that most if not all of the outstanding transaction files are taken in a single cycle. This makes the apply more efficient, i.e. it increases the average number of rows processed per second. So when there are many outstanding transaction files and/or the bulk of the row changes hit a single table, you may temporarily see large files. HVR automatically cleans these files up again, though.
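To make the effect of the limit concrete, here is a rough sketch of the batching idea in Python. It is only an illustration and not HVR's actual implementation: the function name plan_cycles, the use of MB as the unit, and the rule that a cycle always takes at least one file are assumptions made for the example.

```python
from typing import List


def plan_cycles(file_sizes_mb: List[float], cycle_limit_mb: float) -> List[List[float]]:
    """Group outstanding transaction files into integrate cycles.

    Hypothetical model: a limit of 0 means "no limit", and every cycle
    takes at least one file, even if that file alone exceeds the limit.
    """
    cycles: List[List[float]] = []
    current: List[float] = []
    total = 0.0
    for size in file_sizes_mb:
        # Start a new cycle when adding this file would exceed the limit.
        over_limit = cycle_limit_mb > 0 and bool(current) and total + size > cycle_limit_mb
        if over_limit:
            cycles.append(current)
            current, total = [], 0.0
        current.append(size)
        total += size
    if current:
        cycles.append(current)
    return cycles


backlog = [4, 4, 4, 4, 4]  # five outstanding files of 4 MB each
print(len(plan_cycles(backlog, 10)))  # 3 cycles with a 10 MB limit
print(len(plan_cycles(backlog, 0)))   # 1 cycle when the limit is 0
```

With the limit at 0 the whole backlog lands in one cycle, which is why the delimited files written for that cycle can temporarily be large.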
Hope this helps.