Reply To: Perfect PDF 9 Editor / Sep 1 2020
[@Peter Blaise]
>“Yeah, that’s not gonna happen”
Oh, it’s gonna happen alright, only you are not the person who will be doing it now. You had a chance to show what Ashampoo PDF Pro v1.11 does but would not do it. Since that would be an easy test, I am beginning to think you must already know what its text export looks like. So whether it can export to text/Unicode better than Perfect PDF 9 is a piece my review is missing. In that case, I have a solution. I didn’t want to have any other PDF editors installed, but I will get Ashampoo PDF Pro v1.11 and do the test myself, even if I have to purchase a license for the program. Your source is one of those sites that push the user to pay for a “get it fast” download or settle for a “get it slow” free one. I would rather not bother. Ashampoo has a special on the earlier versions anyway.
If Ashampoo PDF Pro v1.11 cannot export to text, then it cannot be considered an alternative PDF editor to Perfect PDF 9 (as you claimed).
>“you try text-exporting a PDF book with a numbered table of contents yourself, and see if Soft-Xpansion Perfect PDF 9 Editor decides it’s a spreadsheet and separates every word with tab-equivalent spaces, and share the results as I have”
Perfect PDF 9 Editor does not “decide it’s a spreadsheet,” nor does it separate every word with “tab-equivalent spaces.” That is not what “export to text” means.
>“I used a PDF of “An Elegant Defense” book by Matt Richtel”
All of the online sources I found for the book (“An Elegant Defense” by Matt Richtel) require a membership to download, which I am not interested in. The alternative is to purchase the electronic form of the book in PDF format for around $33, which I am also not interested in doing just to test the same book you were using.
Instead, I chose something from the Gutenberg project that is free for anyone, and it is more complicated than the book you used. Not all the books available for download at the Gutenberg Project are in the PDF format, but some are. Those that are may or may not have images. I chose one that has a table of contents and a list of figures, plus the content of the book and the images. I cannot imagine anything more complicated than the one I chose.
The book I chose is “The Boy Electrician by Alfred Powell Morgan.” It is available in several formats. To get a complicated PDF file, I chose the “Generated PDF (with images)” edition. The book can be easily read as a PDF in any PDF viewer. This choice is a good one; if anyone else wants to do the same test, they will not have to shell out money to do so.
I have already done the export using Perfect PDF 9, as mentioned several times, but I have not published the actual document because I do not have the rights to do so with the source I was using. The export to text was exactly what would be expected. I now have “The Boy Electrician” by Alfred Powell Morgan, a book that I can use.
Soft Xpansion Perfect PDF 9 has two methods to export to text. If you choose from the Add-Ins Ribbon menu, you can select any number of PDF files (the file you have open already will not be included unless you add it to the list). When you have the list completed, you need to select a folder for output. The dialog that is used is the one that forces the user to start at the root and browse down to a specific folder. Once selected, the user is returned to Perfect PDF 9. Clicking the Next button starts the process. It does not tell the user what encoding will be used.
You can also use the File Save-As option to save the current file you have open as text. As part of that method, you can also add files if you want to export multiples. You do need to select the location to save the exported data, but the method is different in that it uses the common File Save As dialog. In the dialog, just below where you specify the name, there is an encoding choice. It states Unicode. Even though it is a pull-down menu, Unicode is the only choice.
Therefore, we know from the program that it exports Unicode text.
You have described the data “exported to text” as having “tab-delimited separate words in a text file” and “what appear to be tabs spaces between words” and now have changed your description of the text export as containing “tab-equivalent spaces.” There is no such thing as “tab-equivalent spaces” so you are describing your interpretation from sight (your interpretation of what it is) without knowing exactly what it is, even after I have explained that it is Unicode encoded text.
When describing your export experience, you mentioned that the page numbers were in the middle of sentences.
These descriptions make it clear that you do not know what “exported to text” means or what the output looks like when viewed with a text viewer or a hex viewer. You are expecting the output to be formatted like the source it came from. That happens only if you have a feature that exports to a document format that supports formatting, such as an MS Word document. Perfect PDF 9 does not do that. It is not a conversion tool.
When a PDF is exported to text, there is no formatting. Any text characters are written to the output when they are encountered in the source. Therefore, if a page number is found in the middle of the text that represents a sentence, then yes, it will look like the text output is not formatted, … because it isn’t supposed to be. It is just the text … that is all.
EXPORT as TEXT
When I exported the book using the Soft Xpansion Perfect PDF 9 editor, I got exactly what I expected to get … the text in Unicode.
Which program you use to view the exported output can affect whether you think the export is correct, but no matter which one you choose, you cannot accurately determine the encoding just by looking. If you do not already know it, you have to use a tool to investigate the file. Several programs will inspect a data file and report its encoding. These are very common on Linux systems and less common on the Windows platform, but I also have a good way to do it on Windows using a common program.
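For anyone who wants to check a file without a dedicated tool, here is a minimal Python sketch that looks for a byte-order mark (BOM) at the start of the file. It assumes the exporting program writes a BOM, which I have not confirmed for Perfect PDF 9; a file without one returns None and needs a different kind of inspection. The names `sniff_bom` and `sniff_file` are made up for this example.

```python
import codecs

# Byte-order marks, checked longest first because the UTF-32 LE mark
# begins with the same two bytes as the UTF-16 LE mark.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return the encoding named by a leading BOM, or None if there is none."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


def sniff_file(path):
    """Read the first four bytes of a file and report the BOM, if any."""
    with open(path, "rb") as f:
        return sniff_bom(f.read(4))
```

Calling `sniff_file` on the exported text file (whatever path you saved it to) reports the encoding only when a BOM is present, so a None result does not prove the file is plain 8-bit text.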
If you try some of the Word-equivalent editors, some may be able to open the file and others will not. You never stated exactly which Word-equivalent editor you used, so I have no way of doing an exact comparison. Some Word-equivalent editors do not recognize the Unicode text, so they cannot open the file. For example, LibreOffice Writer throws up a dialog stating that the file is corrupt but asks if LibreOffice should try to fix it. If allowed to try, LibreOffice displays a second dialog stating that it cannot fix it, and therefore cannot open the file. That is good because otherwise, LibreOffice would be corrupting a file that was not corrupt to begin with.
A text editor that understands different encodings will show the text normally. A viewer that treats the file as one byte per character will show each Unicode character as two bytes, so naturally the characters will appear to be separated by a space (this is why you cannot determine the encoding just by looking at the contents). It isn’t a space that is being shown; it is the high-order byte of each 16-bit character, which is zero for English text and does not correspond to any printable character in the single-byte character sets we commonly see onscreen, such as Windows 1252 (often incorrectly referred to as ANSI). What Windows programs label “Unicode” is normally the UTF-16 encoding, so English-language text uses 2 bytes for each character. UTF-16 is not limited to 2 bytes; characters outside the basic range take 4 bytes (UTF-8, by comparison, uses 1 to 4 bytes per character). Due to the many different languages and characters in the world, more than 2 bytes are needed to represent all of them. Unicode is very common in Europe.
If you use Microsoft Notepad to open the file, how it displays the data depends on what encoding Notepad was using for the file opened before this one. In many cases, the previous file likely would have been encoded as Windows 1252 (often incorrectly referred to as ANSI) or as UTF-8. The Open dialog will show the encoding that was used for the previous file. If it shows ANSI or UTF-8 or anything that is not Unicode, notice how that changes when you click on the Perfect PDF 9 exported text. It will instantly change to Unicode, but when the file contents are displayed, you may see that the characters all seem to be right next to each other or they may appear to have spaces before each character.
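The “extra space” effect is easy to reproduce. The sketch below is only an illustration in Python, not anything taken from Perfect PDF 9: it encodes a short English string as UTF-16 little-endian (assumed here to be what the “Unicode” label means) and then deliberately misreads the bytes as a single-byte character set. Latin-1 is used as a stand-in for a code page such as Windows 1252.

```python
text = "The Boy"

# UTF-16 LE: two bytes per character for plain English text.
utf16 = text.encode("utf-16-le")
# UTF-8: one byte per character for the same text.
utf8 = text.encode("utf-8")

print(len(text), len(utf16), len(utf8))
print(utf16.hex(" "))  # every second byte is 00

# Misreading the UTF-16 bytes as a single-byte encoding turns each zero
# byte into a NUL character, which many viewers draw as a blank.
misread = utf16.decode("latin-1")
print(repr(misread))
```

Nothing was added between the letters. The blanks are the second byte of each character, shown by a viewer that does not know the file is two bytes per character.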
If shown all together, you do not see what you referred to as the “extra space” or “tab characters” and the other variants you used to describe the contents. You will notice the page numbers (initially lowercase Roman Numerals) are right next to other text. Later on, you will see page numbers right next to other text characters. This is because (as stated already) they are added to the output as found during the export to text process.
Notepad can also show the contents with what looks like extra spaces. It depends on what encoding Notepad worked with just before opening this Unicode file.
Notepad also allows you to save the file in various encodings. If you do any saves, make sure each one is named so you know what encoding you saved it as. The File Save As dialog also has an encoding display, which shows the current encoding. You can change it to save the file in a different encoding.
Not all conversions to other encodings work though. In one case, you will get a warning stating that the file contains Unicode characters that will be lost if the user continues to save the file. If you get that, try looking at that file in Notepad. It will appear as a lot of garbage characters.
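The warning about lost characters can be approximated with a small Python check. This is only a sketch of the idea, not how Notepad does it: the function name `lost_characters` is hypothetical, and Windows 1252 (`cp1252`) is assumed as the target because that is what the “ANSI” choice usually means on English-language Windows.

```python
def lost_characters(text, encoding="cp1252"):
    """List the distinct characters in text that the target encoding cannot store."""
    lost = []
    for ch in text:
        try:
            ch.encode(encoding)
        except UnicodeEncodeError:
            if ch not in lost:
                lost.append(ch)
    return lost


# "\u03a9" is the ohm sign (Greek capital omega), which cp1252 cannot store.
sample = "a 10 k\u03a9 resistor"
print([hex(ord(ch)) for ch in lost_characters(sample)])
```

An empty list means the text would survive the conversion; anything else is what a save in that encoding would throw away.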
One of the encoding choices for saves is Unicode big-endian. The plain “Unicode” choice in Notepad is the little-endian form, which is the more common one on Windows. Since 2 bytes are necessary to store each character, the two bytes can be saved in either of two orders. Big-endian and little-endian refer to the order of the two bytes (is the 8-bit character that we commonly know as a U.S. character on the left or on the right).
Again, the encoding of the previous file can affect how Notepad displays the file. You can get a display that shows the “extra space” before each character, so the same file may be displayed differently by each program you use. Always use a tool that can investigate the data to know for sure. Notepad is a great tool for determining the encoding of text files on the Windows platform.
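The two byte orders can be shown in a few lines of Python. This is just an illustration using the letter “A”; it does not depend on any particular editor.

```python
import codecs

ch = "A"  # code point U+0041

little = ch.encode("utf-16-le")  # low byte first:  41 00
big = ch.encode("utf-16-be")     # high byte first: 00 41
print(little.hex(" "), "|", big.hex(" "))

# The byte-order mark at the start of a file tells a reader which order
# was used: FF FE for little-endian, FE FF for big-endian.
print(codecs.BOM_UTF16_LE.hex(" "), "|", codecs.BOM_UTF16_BE.hex(" "))

# Reading little-endian bytes as big-endian gives a different character.
wrong = little.decode("utf-16-be")
print(hex(ord(wrong)))
```

The same two bytes mean U+0041 in one order and U+4100 in the other, which is why a program has to know (or detect) the byte order before it can display the text.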
SYNOPSIS
The final synopsis is that you were not aware that a text file can contain data that requires more than one byte for each character, or that text files can have different encodings. Guess what … it still is a text file, a text file, a text file! Yes, I said text file!
>“I’m still awaiting responses from Soft-Xpansion, should all of us who asked for the SOS offer to be fulfilled but never got a response ( “yet” ) give up by now?”
I know from your comments that you have received two activation keys. If you requested more than the two you have mentioned previously, you have not given any details. If you did request more than two, what makes you think Soft Xpansion will continue sending you activation keys?
I have tracked each person’s comments, and I see only one person who initially did not have success but, after receiving an e-mail response, was able to activate the program. I did see someone mention they sent an e-mail to Soft Xpansion, but there is no follow-up indicating whether they received a response. I do not see any users who have said that they sent an “activate via e-mail” request to Soft Xpansion but did not receive a response. Therefore, there is no reason to speculate that they did not receive one. When most people receive their response, they will activate the program, but not necessarily make any comments about it.
If you can point to a single comment from someone who is still waiting for a response from Soft Xpansion, please do so. Maybe we can get them to clarify whether it ever arrived, because it could have arrived without the user ever commenting that it was received. If anyone can identify someone who is still waiting on a response from Soft Xpansion, that would help my review, because it would show that the company offered “activate via e-mail” as an alternative to “activate via Internet” but did not respond to those requests.
If anyone else sent an e-mail to Soft Xpansion but did not receive a response, they should post the details here. Otherwise, the public has no way of knowing. As for “giving up now,” each person has their own choice of how long to wait before they give up.
>“It’s not just text export. It’s everything else that Soft-Xpansion did wrong.”
Oh yeah, sure … and they have been doing it for over 25 years … successfully?