I have a most bizarre problem: php include() statements are 'leaking' 5 newline chars (ASCII 10) to the output buffer.
I can isolate this to the include() statement: if I wrap it with ob_start() / ob_end_clean(), the 5x NL do not leak, i.e.
    # A regular include() statement will result in 5x NL being sent to the OB
    include('datafile.inc.php');
    # Output starts with 5x NL, destroying sitemaps (invalid XML)

    # Wrap include() with ob_start() / ob_end_clean()
    ob_start();
    include('datafile.inc.php');
    ob_end_clean();
    # No NL leak to the OB. Resolves problem

This isn't an issue for HTML output, where the file simply starts with <!DOCTYPE HTML> on line 6, but it entirely cooks sitemap XML output, which requires the XML declaration to be on line 1 of the output.
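Rather than discarding the buffer, it can help to capture it and look at the exact bytes. The following is a minimal sketch, not a definitive diagnostic: the helper name capture_include_output is hypothetical, and the temporary file is a stand-in that deliberately echoes five newlines so there is something to capture.

```php
<?php
// Hypothetical helper: include a file while capturing anything it outputs,
// and return the captured bytes instead of throwing them away.
function capture_include_output(string $path): string
{
    ob_start();
    include $path;
    // ob_get_clean() returns the buffer contents and closes the buffer.
    return (string) ob_get_clean();
}

// Stand-in for the real data file (an assumption for this sketch):
// it deliberately echoes five newline characters.
$path = tempnam(sys_get_temp_dir(), 'inc');
file_put_contents($path, '<?php echo "\n\n\n\n\n";');

$leak = capture_include_output($path);
unlink($path);

// Report the length and a hex dump; "0a" is ASCII 10 (newline).
printf("leaked %d byte(s): %s\n", strlen($leak), bin2hex($leak));
```

Because the include runs inside a function, any variables the included file defines land in that function's scope, so this is only useful for diagnosis and not as a replacement for the real include.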
The problem occurs on pages including dynamic content. Basically...
1. Page displays asking for input parameters
2. Input parameters entered or selected
3. Page posts to itself, takes the input, creates the dynamic element of the page based on the input, and then presents the page complete with the selected information based on the input entered in step 2
The initial page (step 1) has no 5x NL, but the page with dynamic content (step 3) does. Working down the stack, the 5x NL come from the include() statement, as explained above and confirmed by the ob_ function wrap that discards the spurious output. So this isn't some forgotten output somewhere in the page processing code; it's from the simple include() statement, 100%.
The include() simply defines an array of data used for page processing, e.g. datafile.inc.php looks something like:
    <?php
    $data['key1'] = array(
        'key1.1' => 'a',
        'key1.2' => 'b',
        'key1.3' => 'c');
    $data['key2'] = array( etc...
    ?>

The next 'obvious' idea is that there is some spurious output within this data file, but cutting the content down to just a single key-set still gives the same 5 NL. Even cutting it down to nothing more than:
    <?php ?>

still results in the 5x NL output! So it's not something in the file itself.
Next thought was file encoding. All files are in ANSI, but moving them all to UTF-8 without BOM makes no difference. To be clear:
- HTML 'template' file used to define the initial (step 1) and 'with results' (step 3) pages is ANSI
- The php include() data file is ANSI
- The php code file that presents the template page and builds the dynamic content on POST is in ANSI
So, all parts are in the same encoding. I mention ANSI, but saving everything as UTF-8 without BOM makes no difference. In any case, this is a leak of 5x NL (ASCII 10), not a single BOM char.
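One way to rule encoding in or out is to look at the raw bytes at both ends of each file. This is a rough sketch only: the helper name head_and_tail_hex is hypothetical, and the temporary file is a stand-in that deliberately carries a UTF-8 BOM so the check has something to find.

```php
<?php
// Hypothetical helper: report whether a file starts with a UTF-8 BOM and
// show its first and last few bytes as hex.
function head_and_tail_hex(string $path, int $n = 8): array
{
    $bytes = (string) file_get_contents($path);
    return [
        'has_utf8_bom' => strncmp($bytes, "\xEF\xBB\xBF", 3) === 0,
        'head' => bin2hex(substr($bytes, 0, $n)),
        'tail' => bin2hex(substr($bytes, -$n)),
    ];
}

// Stand-in file (an assumption for this sketch): BOM, then a short script
// ending in a single newline.
$path = tempnam(sys_get_temp_dir(), 'inc');
file_put_contents($path, "\xEF\xBB\xBF<?php \$data = array();\n");

$report = head_and_tail_hex($path);
unlink($path);

var_dump($report);
```

In the hex output, a BOM shows up as a leading efbbbf, and each 0a at the tail is one newline character.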
I'm at a loss as to why/what is generating these 5x 'extraneous' NL chars. I can work around the issue with the ob_ function wrap of the data file include as shown above, but it still leaves me wondering WT... is going on and why/how an include() is generating NL chars to the output buffer. Any thoughts gratefully received 🙂
Edit: I have been doing further testing, but Dharman gave me the clue. Thank you. I have to admit I have always closed my PHP tags, and I generally try to keep HTML separate from PHP code for anything but trivial pages, else you get into an almighty tangle of spaghetti. The source of the extraneous NL chars was 5 blank lines at the end of the data file, after the closing ?>. So yes, both Markus Zeller and Dharman, it is indeed possible to omit the closing ?>, and you have found my issue. Also, for all these years I've used include('filename') 😱
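For anyone wanting to reproduce this, here is a small sketch comparing a data file that has a closing tag followed by blank lines with one that omits the closing tag. The helper name leaked_output and the temporary files are hypothetical stand-ins, not the real datafile.inc.php.

```php
<?php
// Hypothetical helper: include a file and return whatever it writes
// to the output buffer.
function leaked_output(string $path): string
{
    ob_start();
    include $path;
    return (string) ob_get_clean();
}

$dir = sys_get_temp_dir();

// Variant 1: closing tag followed by six newlines. PHP consumes the single
// newline directly after the closing tag, so five reach the output.
$withTag = tempnam($dir, 'inc');
file_put_contents($withTag, "<?php \$data = array('key1' => 'a'); ?>" . str_repeat("\n", 6));

// Variant 2: no closing tag. The trailing newlines are still inside PHP
// code, where they are plain whitespace and produce no output.
$noTag = tempnam($dir, 'inc');
file_put_contents($noTag, "<?php \$data = array('key1' => 'a');" . str_repeat("\n", 6));

$a = leaked_output($withTag);
$b = leaked_output($noTag);
unlink($withTag);
unlink($noTag);

printf("with closing tag: %d byte(s)\n", strlen($a));
printf("without closing tag: %d byte(s)\n", strlen($b));
```

Omitting the closing tag in files that contain only PHP means trailing blank lines can never reach the output, which is why the data file no longer needs the ob_ wrap.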
