Home Forums SharewareOnSale Deals Discussion Perfect PDF 9 Editor / Sep 1 2020

  • This topic has 91 replies, 1 voice, and was last updated 1 week, 2 days ago by Margarethux.
  • #16494097 Reply | Quote
    Peter Blaise
    Guest

    2 things, [@Gary]:

    1 – No, Soft-Xpansion has not responded to emails for some of us, even by now, many days since our first installation attempts, and in the meantime, we do not even have a working demo trial copy on screen to explore and share about, so all we can share about is the failure to install and the failure of Soft-Xpansion to respond.

    2 – You say it’s worth it, but you do not say what about Soft-Xpansion Perfect PDF 9 Editor’s features and benefits you experience as worth it — could you tell us more?

    Thanks for exploring this and sharing.
    .

    #16494370 Reply | Quote
    Gary
    Guest

    [@Peter Blaise]

    As for me stating that it is worth it, I have mentioned that I have been using the PDF export to text feature. I also have used it to do several other things. I am writing about these one-at-a-time in my review, which I hope to have finished soon. I don’t like everything about the program, which is why I wanted to see what Ashampoo PDF Pro v1.11 does with its Export to text in Unicode encoding.

    1) You have shown us that you were able to export to Text; even though you did not understand the output, at least you were able to successfully activate the program, otherwise, you would not be able to export to Text at all.

    2) You have stated that an alternative is Ashampoo PDF Pro v1.11.

    Therefore, I am asking how does the output from Ashampoo PDF Pro v1.11 differ from what you got with Perfect PDF 9 for export to text with Unicode encoding. That should be a simple enough test to do and paste the results into a comment. I look forward to seeing how this one feature compares.

    #16495140 Reply | Quote
    Peter Blaise
    Guest

    No, [@Gary] I’m not “bragging”, I’m encouraging folks to explore and share.

    The alternatives I tried and shared about were available when I explored for alternatives, and still are available.

    Soft-Xpansion Perfect PDF 9 Pro is not.
    __________

    Using the search criteria you quote, I just also found Ashampoo PDF Pro v1.07, it says crack, but it’s not, it’s just an already licensed dll, no crack program to run, no viruses at [ VirusTotal. com].

    I also found Portable Movavi PDF Editor, so I’ll check that out, too.
    __________

    At every giveaway, I web-search for [ compare _____ ], such as [ compare Soft-Xpansion PDF Editor ], and I see what’s out there, before I even pursue the giveaway offer.

    I also read the vendor’s web site and see what free and trial versions they offer directly anyway.

    Everyone’s Google search returns something different for the same search variable because, hey, Google can.
    __________

    These threads are not about each other, not about you or me, but these threads are about the offering, sharing help, and comparing alternatives.

    Let’s do that.
    .

    #16495801 Reply | Quote
    Gary
    Guest

    [@Peter Blaise]

    It is great that you have found other PDF editors you can compare to Perfect PDF 9. The more the merrier. It will be interesting to see what the others produce for the Export to Text as Unicode encoding.

    My review is dependent on knowing as much about the alternatives as possible, but I don’t have other PDF editors installed, and the one I use does not export to text using Unicode encoding, so for me, Perfect PDF 9 has been a welcome tool, and to get it free from Sharewareonsale has been the best price.

    Since I know you already have Ashampoo PDF Pro v1.11, it seemed like a simple task to see what it produces when it exports to text using Unicode encoding. At least you can knock that one away. Then later, see what the others do and report on them too.

    #16496076 Reply | Quote
    Peter Blaise
    Guest

    Yeah, that’s not gonna happen, [@Gary], but you try text-exporting a PDF book with a numbered table of contents yourself, and see if Soft-Xpansion Perfect PDF 9 Editor decides it’s a spreadsheet and separates every word with tab-equivalent spaces, and share the results as I have ( I used a PDF of “An Elegant Defense” book by Matt Richtel ).

    I’m still awaiting responses from Soft-Xpansion, should all of us who asked for the SOS offer to be fulfilled but never got a response ( “yet” ) give up by now?

    It’s not just text export.

    It’s everything else that Soft-Xpansion did wrong.

    Including their one-liner, telling us to take it off the discussion thread.

    That’s something else that’s not gonna happen, either.
    .

    #16502350 Reply | Quote
    Gary
    Guest

    [@Peter Blaise]

    >”Yeah, that’s not gonna happen”

    Oh, it’s gonna happen alright, only that you are not the person that will be doing it now. You had a chance to show what Ashampoo PDF Pro v1.11 does but will not do it. Since that would be an easy test, I am beginning to think you must already know what the Text export looks like. So whether it can export to text/Unicode better than Perfect PDF 9 is a piece my review is missing. In that case, I have a solution. I didn’t want to have any other PDF editors installed, but I will get Ashampoo PDF Pro v1.11 and do the test myself, even if I have to purchase a license for the program. Your source is one of those that push the user into a “get it fast” for money, or “get it slow” for free downloads. I would rather not bother. Ashampoo has a special on the earlier versions anyway.

    If Ashampoo PDF Pro v1.11 cannot export to text, then it cannot be considered an alternative PDF editor to Perfect PDF 9 (as you claimed).

    >”you try text-exporting a PDF book with a numbered table of contents yourself, and see if Soft-Xpansion Perfect PDF 9 Editor decides it’s a spreadsheet and separates every word with tab-equivalent spaces, and share the results as I have”

    Perfect PDF 9 Editor does not “decide it’s a spreadsheet,” nor does it separate “every word with tab-equivalent spaces.” That is not what “export to text” means.

    >”I used a PDF of “An Elegant Defense” book by Matt Richtel”

    All of the online sources I found for the book (“An Elegant Defense” by Matt Richtel) require a membership to download, which I am not interested in doing. The alternative is to purchase the electronic form of the book in PDF format for around $33, which I am not interested in doing just to test the same book you were using.

    Instead, I chose something from the Gutenberg project that is free for anyone, and it is more complicated than the book you used. Not all the books available for download at the Gutenberg Project are in the PDF format, but some are. Those that are may or may not have images. I chose one that has a table of contents and a list of figures, plus the content of the book and the images. I cannot imagine anything more complicated than the one I chose.

    The book I chose is “The Boy Electrician by Alfred Powell Morgan.” It is available in several formats. To get a complicated PDF file, I chose the “Generated PDF (with images)” edition. The book can be easily read as a PDF in any PDF viewer. This choice is a good one; if anyone else wants to do the same test, they will not have to shell out money to do so.

    I have already done the export using Perfect PDF 9, as mentioned several times, but I have not published the actual document because I do not have the rights to do so with the source I was using. The export to text was exactly what would be expected. I now have “The Boy Electrician” by Alfred Powell Morgan, a book that I can use.

    Soft Xpansion Perfect PDF 9 has two methods to export to text. If you choose from the Add-Ins Ribbon menu, you can select any number of PDF files (the file you have open already will not be included unless you add it to the list). When you have the list completed, you need to select a folder for output. The dialog that is used is the one that forces the user to start at the root and browse down to a specific folder. Once selected, the user is returned to Perfect PDF 9. Clicking the Next button starts the process. It does not tell the user what encoding will be used.

    You can also use the File Save-As option to save the current file you have open as text. As part of that method, you can also add files if you want to export multiples. You do need to select the location to save the exported data, but the method is different in that it uses the common File Save As dialog. In the dialog, just below where you specify the name, there is an encoding choice. It states Unicode. Even though it is a pull-down menu, Unicode is the only choice.

    Therefore, we know from the program that it exports Unicode text.

    You have described the data “exported to text” as having “tab-delimited separate words in a text file” and “what appear to be tabs spaces between words” and now have changed your description of the text export as containing “tab-equivalent spaces.” There is no such thing as “tab-equivalent spaces” so you are describing your interpretation from sight (your interpretation of what it is) without knowing exactly what it is, even after I have explained that it is Unicode encoded text.

    When describing your export experience, you mentioned that the page numbers were in the middle of sentences.

    These descriptions make clear that you do not know what “exported to text” means or what the output looks like when viewed with a text viewer or a hex viewer. You are expecting the output to be formatted like the source it came from. That happens only if you have a feature that exports to a document format that supports formatting, such as an MS Word document. Perfect PDF 9 does not do that. It is not a conversion tool.

    When a PDF is exported to text, there is no formatting. Any text characters are written to the output when they are encountered in the source. Therefore, if a page number is found in the middle of the text that represents a sentence, then yes, it will look like the text output is not formatted, … because it isn’t supposed to be. It is just the text … that is all.

    EXPORT as TEXT

    When I exported the book using the Soft Xpansion Perfect PDF 9 editor, I got exactly what I expected to get … the text in Unicode.

    Which program you use to view the exported output can affect whether you think the export is correct or not, but regardless of what you choose to view the output, you cannot make an accurate determination of what encoding was used just by looking. If you do not already know, you have to use some tool to investigate the text. There are several programs that will investigate a data file and give the encoding. These are very common on Linux systems, less common on the Windows platform, but I also have a good way for the Windows platform using a common program.
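One way such a tool can work is by checking the first few bytes of the file for a byte-order mark (BOM). The following is a minimal Python sketch, not a description of any particular program: the function name `sniff_bom` is made up for illustration, and it assumes the file actually starts with a BOM, which this thread does not confirm for the Perfect PDF 9 export.

```python
import codecs
from typing import Optional

# Known byte-order marks. UTF-32 LE must be tested before UTF-16 LE because
# its BOM (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes) -> Optional[str]:
    """Return an encoding name if data starts with a known BOM, else None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None  # No BOM: the encoding has to be determined some other way.


# Hypothetical sample standing in for the first bytes of an exported file.
sample = codecs.BOM_UTF16_LE + "Contents".encode("utf-16-le")
print(sniff_bom(sample))
```

A file without a BOM returns `None` here, so this check can confirm an encoding but cannot rule one out.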

    If you try some of the Word-equivalent editors, some may be able to open the file, others will not be able to. You never stated exactly what Word-equivalent editor you used, so I have no way of doing an exact comparison. Some Word-equivalent editors do not have the ability to recognize the Unicode text, so they cannot open the file. For example, LibreOffice Writer throws up a dialog stating that the file is corrupt but asks if LibreOffice should try to fix it. If allowed to try and fix it, LibreOffice displays a second dialog stating that it cannot fix it, and therefore cannot open the file. That is good because otherwise, LibreOffice would be corrupting a file that was not corrupt to begin with.

    A good text editor that understands different encodings will likely show each Unicode character as two bytes, so naturally, the characters will appear to be separated by a space (this is why you cannot determine the encoding just by looking at the contents). It isn’t a space the editor is showing; it is simply the high-order byte, which is zero for the characters in the 128 or 256 character sets we commonly see onscreen as Windows 1252 (often incorrectly referred to as ANSI) or UTF-8 encodings. The editor is showing the entire character as Unicode (strictly speaking, UTF-16) so, in the case of an English language text, it uses 2 bytes to show each character. Unicode is not limited to 2 bytes per character: UTF-16 uses 4 bytes for characters outside the basic range, and UTF-8 uses anywhere from 1 to 4 bytes. Due to the many different languages/characters in the world, more than 2 bytes are needed to have all of them represented. Unicode is very common in Europe.
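The two-bytes-per-character effect is easy to reproduce. This is a small Python sketch using a made-up sample phrase rather than the actual exported book; it assumes the export is UTF-16 little-endian, which is what Windows programs usually mean by “Unicode”.

```python
# Encode a short English phrase as UTF-16 little-endian, without a byte-order mark.
text = "Table of Contents"
utf16_bytes = text.encode("utf-16-le")

# Each ASCII-range character becomes two bytes: the familiar byte, then 0x00.
first_four_chars_hex = utf16_bytes[:8].hex(" ")
print(first_four_chars_hex)

# A viewer that assumes one byte per character sees a NUL after every letter,
# which many programs draw as a blank or a box.
misread = utf16_bytes.decode("latin-1")
print(repr(misread[:8]))

# Decoding with the right encoding recovers the text with no gaps.
print(utf16_bytes.decode("utf-16-le"))
```

Note that the gap appears after every single character, not between words, which is one way to tell this effect apart from real tab or space characters.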

    If you use Microsoft Notepad to open the file, how it displays the data depends on what encoding Notepad was using for the file opened before this one. In many cases, the previous file likely would have been encoded as Windows 1252 (often incorrectly referred to as ANSI) or as UTF-8. The Open dialog will show the encoding that was used for the previous file. If it shows ANSI or UTF-8 or anything that is not Unicode, notice how that changes when you click on the Perfect PDF 9 exported text. It will instantly change to Unicode, but when the file contents are displayed, you may see that the characters all seem to be right next to each other or they may appear to have spaces before each character.

    If shown all together, you do not see what you referred to as the “extra space” or “tab characters” and the other variants you used to describe the contents. You will notice the page numbers (initially lowercase Roman Numerals) are right next to other text. Later on, you will see page numbers right next to other text characters. This is because (as stated already) they are added to the output as found during the export to text process.

    Notepad can also show the contents with what looks like extra spaces. It depends on what encoding Notepad worked with just before opening this Unicode file.

    Notepad also allows you to save the file in various encodings. If you do any saves make sure each one is named so you know what encoding you saved it as. When you do the File Save As, the Save dialog also has an encoding display. It will show you what the current encoding is. You can change it to save the file to use a different encoding.

    Not all conversions to other encodings work though. In one case, you will get a warning stating that the file contains Unicode characters that will be lost if the user continues to save the file. If you get that, try looking at that file in Notepad. It will appear as a lot of garbage characters.
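The reason for that warning can be sketched in Python. The sample string is made up, and `cp1252` is Python’s name for Windows 1252; what Notepad does internally is not claimed here.

```python
# Text containing a character (Greek capital omega) that Windows 1252 lacks.
text = "Ohm: Ω"

# A strict conversion refuses to continue, much like the warning dialog.
try:
    text.encode("cp1252")
    strict_failed = False
except UnicodeEncodeError:
    strict_failed = True
print("strict conversion failed:", strict_failed)

# Forcing the save replaces every unsupported character with "?".
lossy = text.encode("cp1252", errors="replace")
print(lossy)

# The original character cannot be recovered from the saved bytes.
print(lossy.decode("cp1252"))
```

Text that uses only characters present in the target encoding converts without loss, so a plain English export would normally survive this step.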

    One of the encoding choices for Saves is Unicode big-endian. Notepad’s plain “Unicode” choice is the little-endian variant (UTF-16 LE), which is the usual one on Windows. Since 2 bytes are necessary to store each character, the two bytes could be saved in either of the two orders. Big-endian and little-endian refer to the order of the two bytes (whether the 8-bit character that we commonly know as a U.S. character is on the right or on the left).
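The two byte orders can be seen directly with Python’s standard codecs. This is only an illustration with a single letter, not output from any of the programs discussed.

```python
letter = "A"  # code point U+0041

little = letter.encode("utf-16-le")  # low byte first
big = letter.encode("utf-16-be")     # high byte first

print(little.hex(" "))
print(big.hex(" "))

# Reading little-endian bytes as if they were big-endian gives a different
# character entirely (U+4100), which is why the byte order has to be known.
wrong = little.decode("utf-16-be")
print(hex(ord(wrong)))
```

A byte-order mark at the start of a file exists to tell a reader which of the two orders was used.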

    Again, the encoding of the previous file can affect how Notepad displays this one. You can get a display that shows the “extra space” before each character, so no matter what you use to display the file, it may be displayed differently by each. Always use a tool that can investigate the data to know for sure. Notepad is a great tool for determining the encoding of text files on the Windows platform.

    SYNOPSIS

    The final synopsis is that you were not aware that a text file can contain data that requires more than one byte for each character, or that text files can have different encodings. Guess what … it still is a text file, a text file, a text file! Yes, I said text file!

    >”I’m still awaiting responses from Soft-Xpansion, should all of us who asked for the SOS offer to be fulfilled but never got a response ( “yet” ) give up by now?”

    I know that you have received two per your comments. If you requested more than the two you have mentioned previously, you have not given any details. If you did request more than two, what makes you think Soft Xpansion will continue sending you activation keys?

    I have tracked each person’s comments, and I see only one that initially did not have success but after receiving an e-mail response was able to activate the program. I did see someone mention they sent an e-mail to Soft Xpansion, but there is no follow-up indicating whether they received a response. I do not see any users that have said that they sent an activate by e-mail method request to Soft Xpansion but did not receive a response. Therefore, there is no reason to speculate that they did not receive a response. When most people receive their response, they will activate the program, but not necessarily make any comments about it.

    If you can point to a single comment that is still waiting for a response from Soft Xpansion, please do so. Maybe we can get them to clarify if it ever arrived because it could have arrived but the user never commented that the response was received. If anyone can identify that someone is still waiting on a response from Soft Xpansion, that would help with my review if the company itself offered an alternative to “activate via Internet” but not respond to the “activate via e-mail” method.

    If anyone else sent an e-mail to Soft Xpansion but did not receive a response, they should post the details here. Otherwise, the public has no way of knowing. As far as “giving up now” each person has their own choice of how long to wait before they give up.

    >”It’s not just text export. It’s everything else that Soft-Xpansion did wrong.”

    Oh yeah, sure … and they have been doing it for over 25 years … successfully?

    #16502606 Reply | Quote
    Gary
    Guest

    [@Peter Blaise]

    I should have put this up first. I reworked it to leave out the parts that I also had in my last comment.

    When you made your statement that the output was supposed to be a text file, a text file, a text file, I asked “What encoding did you export the PDF file as?” to see what you knew, even though it was clear you did not know what you were viewing.

    >”I did not choose any encoding for a text file export,”
    That is correct because you can’t. Soft Xpansion exports only in Unicode. In one method it shows that the output is Unicode.

    >”and cannot imagine the meaning of any coding for a test file export”

    I am sure you truly think like that. If you knew about different encodings for text, you would have instantly recognized that the output is not using 1 byte per character. Now you know that not all text files are 1 byte per character.

    >”or anything I could have done or chosen for a text file export that would have put what appear to be tabs spaces between words causing them to appear as if in spreadsheet columns.”

    So you have gone from stating that the exported output had “spaces as tab-delimited” a few days ago to “what appear to be tabs spaces” today. Are you starting to accept that they are not tabs?

    UNICODE EXPLANATION:

    Text file does not imply any encoding at all. It simply means the contents do not have any formatting included such as an MS Word document would have. A text file does not imply that the contents are encoded as 1-byte characters or use ASCII, EBCDIC, or the incorrect terms ANSI, Windows Standard encoding, or MS-DOS standard encoding or any other encoding or formatting. Neither does it imply that the content CANNOT be 2-byte, 3-byte, or 4-byte characters, or that those characters cannot be considered as UTF-8, UTF-16, or UTF-32, or called Unicode.

    The US alphabet has 26 characters, upper and lower case variants, punctuation marks, etc, and all can fit within 128 placeholders. That was the character set of the earliest Personal Computers, and even of mainframes/super-minis, and it is referred to as ASCII (as opposed to EBCDIC like on your IBM mainframe). Each byte is a binary number, but what that number represents depends on the character set, which is an agreement that each binary number stands for some specific character (even unprintable ones). 128 variants can be formed using only 7 bits, so the 8th bit of a byte was left as a zero. When the IBM PC came along, they used the 8th bit to add another 128 characters (playing card symbols, line drawing shapes, etc), which was often referred to as Extended-ASCII [Note: IBM wanted to use the Epson MX80 printer, but it communicated using 7 bits at a time. The design was modified to store the extra 128 characters and use 8-bits to communicate, so the IBM Printer was the original Epson MX80, modified, and had the IBM logo on it.]

    As other countries adopted PCs, they wanted to have their characters be available too, but now all 8-bits of a byte were already in use. That means more than 1 byte would be needed to add support for languages such as Spanish, French, Italian, German, and others. Soft Xpansion is based in Germany, and all over Europe, there is much more of a need to have support for all languages worldwide. Therefore, it is no wonder that Soft Xpansion exports text only as Unicode. The first 128 characters of Unicode are the same as the ASCII characters. If the file used only those characters, it is easy to convert to UTF-8 or the incorrect term ANSI (actually Windows 1252) (UTF-8 and Windows 1252 are different).
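A short Python sketch shows both points: text limited to the first 128 characters has identical bytes in ASCII, UTF-8, and Windows 1252, while anything beyond that range does not. The sample strings are arbitrary, and `cp1252` is Python’s name for Windows 1252.

```python
# Text that uses only the first 128 characters.
ascii_text = "Perfect PDF 9"
same_bytes = (
    ascii_text.encode("ascii")
    == ascii_text.encode("utf-8")
    == ascii_text.encode("cp1252")
)
print("ASCII-only text is identical in all three:", same_bytes)

# A German word with one character outside that range.
word = "Straße"
as_cp1252 = word.encode("cp1252")  # one byte for the sharp s
as_utf8 = word.encode("utf-8")     # two bytes for the sharp s
print(as_cp1252)
print(as_utf8)
print("same bytes:", as_cp1252 == as_utf8)
```

This is why a file of plain English text converts cleanly between these encodings, while text with accented or non-Latin characters has to be decoded with the encoding it was written in.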

    Since we already know that we are out of space with 1 byte used to store each character, more bytes are needed to contain Unicode, so what do you think Unicode looks like in a standard text viewer when the text contains only characters used in the U.S. alphabet?

    IT LOOKS LIKE THERE IS AN EXTRA SPACE BETWEEN EACH CHARACTER!

    And someone that doesn’t know any better may think they are tabs. Granted, if you are not using a standard text viewer, it may collapse the leading bits and show the characters we are accustomed to seeing right next to each other.

    >”I guess that Soft-Xpansion Perfect PDF 9 Editor saw a table of contents with numbered chapters and decided the contents was a spreadsheet, but I’m guessing”

    Don’t guess. Text files don’t have any formatting, so Perfect PDF 9 will not try to line items up in a column by adding tabs. Programs that convert PDFs to formats that are formatted (e.g., MS Word documents), will attempt to keep the original formatting in the output. That does not happen for exported text.

    Take a look at your exported text using a Hex Editor/Viewer.
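For anyone without a hex viewer installed, a few lines of Python can do the same inspection. This is a rough sketch: the helper names `byte_census` and `hex_dump` are made up, and the sample bytes are a stand-in, not the real export. Real tab characters show up as byte 09, real spaces as 20, and the padding bytes of two-byte characters as 00.

```python
from collections import Counter


def byte_census(data: bytes) -> dict:
    """Count the bytes that matter for the tabs-versus-encoding question."""
    counts = Counter(data)
    return {
        "tab": counts[0x09],
        "space": counts[0x20],
        "nul": counts[0x00],
        "total": len(data),
    }


def hex_dump(data: bytes, width: int = 16) -> str:
    """Render bytes as offset, hex values, and printable characters."""
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        text_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}  {hex_part:<{width * 3}} {text_part}")
    return "\n".join(lines)


# Hypothetical sample: a chapter number, one real tab, and a title, as UTF-16 LE.
sample = "1\tIntro".encode("utf-16-le")
print(byte_census(sample))
print(hex_dump(sample))
```

To check an actual file, replace `sample` with the result of `open(path, "rb").read()`, where `path` is wherever the export was saved.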

    >”So, has anyone else tried exporting to text?”

    Randy stated to you that he exported two PDFs to text and they were just fine. I succeeded in exporting to text, and you succeeded in exporting to text even though you did not understand what you were looking at, so that makes at least three of us. Hopefully, some others will too.

    >”… it just confirms for me that Soft-Xpansion does not understand what a human understands when we look at the contents of a PDF file.”

    Look at it how? Using what program? You cannot look at the contents of a text file and know what encoding it uses.

    >”Soft-Xpansion does not understand”

    And yet, Soft Xpansion has been in the business for over 25 years, and is considered one of the top companies when it comes to PDF. They have to have a good idea of what a human understands when they look at the contents of a PDF file. Without understanding the PDF format, it might seem like an odd sequence of characters. If you mean look at the “text exported from a PDF file,” then they also know that depending on what you use to view the output will affect what you see and therefore should not be the basis of determining what encoding is in use, and they now know that some people in the U.S. have no clue what Unicode is or that it can be in a text file.

    >”Let’s wait for version 11 or later, with no registration needed for it to work on screen at least as a trial.”

    “Let’s not.” At least, don’t include me. I would suggest that YOU wait for version 111. Soft Xpansion will probably be thrilled if you do. The rest of us that were interested enough to start the download process of Perfect PDF 9 would probably like to be able to use Perfect PDF 9, and hope that they (Soft Xpansion) are not so discouraged with their experience in this offer that they will not return to give SoS users another chance for their software in the future.

    >”Thanks in advance for your own report of testing text export.”

    No need to thank me in advance, I already did it. As stated all along, as soon as the SoS offer download page had a code, I have been exporting to text using Perfect PDF 9. I am glad to get Unicode exported. Most PDF editors do not export in Unicode. When I need Unicode, Perfect PDF 9 saves me from having to process the exported output through another program. I might have more in my review, which is coming along fine. I did have to do other things but will be finishing that up.

    #16505368 Reply | Quote
    Peter Blaise
    Guest

    Oh, this is fun, the gift that keeps on giving, like my book clubs, where the worst books generate the most vigorous conversations.

    We’re all having fun, right?
    __________

    Alternatives:

    No, I do not offer alternatives as if I have compared them to the current offering, but only to suggest alternatives for us to explore – great that others feel engaged enough to do a one-for-one comparison, more, please … and thanks.
    __________

    PDF to text and Unicode versus tabs / whitespace tab equivalents:

    … depends not only on the source PDF, but on the presence or lack of sophisticated intelligence of the programmers.

    Scrutinizing source PDFs ignores the programmer’s ability or inability to interpret the presentation of PDF content as a human interprets the presentation.

    That is, I do not care if there are invisible non-printing codes between letters or words or page elements, or if letters are stored in any particular font or unicode or multicode or bold markers for each letter versus one bold marker for an entire word or headline of multiple words.

    If Soft-Xpansion programmers care, and try to make it my business, then that says possibly a variety of things, including they are selling to PDF publishing houses who are the source of their own PDFs, not end users who are dealing with PDFs from any source.

    Line breaks mid sentence, and page numbers mid sentence, are only one part of the PDF-to-text presentation I shared, the most critical challenge to understand was the tab-delimiting between words.

    Whether “tab” is a character or is a presentation effect accomplished by any other means does not concern me.

    No, the characters were not separated as if 2 bytes instead of one, the words were separated as if laid out in spreadsheet columns.

    Note, I shared text output for all to inspect – look at it, it’s just cut-and-paste – and I cited the source, so go ahead and dive in, anyone who wants to explore.

    I note that tab-equivalent-delimited results are unrelated to Unicode, that is, the spaces between words are variable, but exactly what is needed to align the next word with the word above it in the previous row, that is, a column, a spreadsheet, tab-delimited, by spaces or tabs or whatever it takes to line up the next word regardless of the length of the current word, but hey …
    __________

    Free resources, because, hey this is an SOS freebie thread.

    I shared a source that was free to me, and not available in other formats.

    “The Boy Electrician” book by Alfred Powell Morgan is freely available side by side with other freely available file formats.

    The “The Boy Electrician” PDF is not an original from the publisher but was generated, and there is no need to export text from it.

    As mentioned, an original publisher probably can purposefully insert non-printable codes that sully the PDF’s auto-conversion to text, I do not know, nor do I care.

    I get free ebooks from the library, at least, some are in PDF format only, plus there are a gazillion resources out there, and the internal format is dependent on variables we cannot know, considering the various re-editing and re-formatting and “cleaning up” done between any original sources versus the copy we can get our hands on – again, I do not know, nor do I care.

    My point, and I do have one, is that Soft-Xpansion Perfect PDF 9 Editor seems to scrutinize the internal coding structures, and miss the human-interpretable presentation.

    Who knows, maybe the PDF I started with has a code between words, and Soft-Xpansion honored that … it does not matter, really.
    __________

    Soft-Xpansion never replying to email request ( yet ):

    Yes, Soft-Xpansion might have interpreted some email requests as non-legitimate, and then decided to not reply … who knows, maybe someone tripped over a power cord at their workspace and they lost some emails in process.

    In the cases I experienced, they would be wrong to not respond and empower folks to try out their old software in order to get familiar with the terrific customer service capabilities of their software and of their support, and they have missed opportunities to present their software to SOS participants regardless of their reasons for not responding ( yet ).
    __________

    These are just tools, and if they work for anyone anytime for any purpose, great.

    That is, if we can actually install them and get them working on screen.
    __________

    We’re having gun, right?
    .

    #16505370 Reply | Quote
    Peter Blaise
    Guest

    Make that “fun“, as in:

    We’re having fun, right?

    ( I hate spill chick )
    .

    #16512770 Reply | Quote
    Peter Blaise
    Guest

    For late comers, or the hopeful, now that the official v9 SOS offer is over, there’s a trial version of Soft-Xpansion Perfect PDF 11 Premium at:

    [ https :// soft-xpansion. com/products/perfect-pdf-11-premium/ ]

    … well, not really available, we have to register with Soft-Xpansion and they email us a link to a trial download, that’s how they play it.

    Note it also has the same challenges as noted throughout this thread, in that:

    – prior version must be removed, uninstalled, even working full versions, all must go, just to try the new version,

    – if it does not find its secret portal to phone home ( phoning home failed for some of us ),
    then:

    – it does not work in trial mode,

    – any previous version also does not work because, hey, we just removed, uninstalled any previous version,

    – it will not work until all of the below are satisfied, time frame up to Soft-Xpansion, not in our own control:

    — we ourselves must keep separate track of and coordinate these 5 things that Soft-Xpansion does not keep track of:

    — our specific email ( some of us have more than one and or help others try to install ), one email will marry one computer, one computer code, one activation number and one license file,

    — our personal way of identifying our specific computer ( some of us have more than one computer or help others try to install ), one computer will marry one email, one computer code, one activation number, and one license file,

    — our specific computer’s specific identification code as calculated by Soft-Xpansion ( they do not say what computer the code applies to, so we must keep track and coordinate by our own method ), one computer code will marry one email, one computer, and one activation number,

    — any specific Soft-Xpansion activation number ( again, tethered to a specific email and a specific computer by Soft-Xpansion’s identification, and tethered to a specific computer by our own identification method ), one activation number will marry one email, one computer, one computer code, and one license file,

    — a specific license file sent to us ( some of us never got one ) for our specific computer and our specific activation number, one license file will marry one email, one computer, one computer code, and one activation number.

    Yeah, Soft-Xpansion is outdoing even Franzis!

    Please, let us know if you try, and please let us know how it works, and what you think, especially comparatively.
    .

    #16519263 Reply | Quote
    Gary
    Guest

    [@Peter Blaise]

    >”Line breaks mid sentence, and page numbers mid sentence, are only one part of the PDF-to-text presentation I shared, the most critical challenge to understand was the tab-delimiting between words.”

    When a sentence starts on one page and ends on the next, where do you think the line break should be placed in the exported text output? Should it be before the start of the sentence on the starting page? Or, should it be after the end of the sentence on the following page? Either way, you could not go on that rule alone. What if you choose to place the linefeed before the start of the sentence, but the start is a very long part of the page, but the following page has a very short part to finish the sentence. Do you choose instead to place it at the start if the start is shorter but at the end if the end is shorter? What if the sentence carries over multiple pages? Do you accumulate linefeeds, also, same for page numbers? The simple text export places the linefeed or page-numbers in the output in the sequence they are found. It does not make judgments.

    What you want is a conversion program in the same format as the source, only that the output is text characters.

    The exported text is not in the form of any type of formatting as you seem to expect in spite of me reiterating that the exported text output is nothing more than the text characters in the same sequence that they are encountered.

    >”Whether “tab” is a character or is a presentation effect accomplished by any other means does not concern me.”

    The tab character is not a common dependable method of formatting text in programs such as MS Word. It is used mostly in text documents to line things up or form indentation. Perfect PDF 9 does not try to format the exported text.

    >”No, the characters were not separated as if 2 bytes instead of one, the words were separated as if laid out in spreadsheet column.”

    Correct, the characters were not separated. They were continuous, only that it took 2 bytes for each character. In 2-byte Unicode, characters of US English do not need the upper eight bits, so one byte of each pair is zero, and what you get appears to be spaces. They are not.
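
    A minimal sketch of the 2-byte point, assuming the export is UTF-16 little-endian (the thread only says "Unicode", so the exact encoding is an assumption, and the sample string is just illustrative):

```python
# Illustrative only: how plain English text looks as UTF-16 bytes.
text = "The Volt"

utf16_le = text.encode("utf-16-le")  # little-endian, no byte order mark
utf8 = text.encode("utf-8")

# Each ASCII-range character becomes two bytes in UTF-16; one of them is 0x00.
print(utf16_le.hex(" "))  # 54 00 68 00 65 00 20 00 56 00 6f 00 6c 00 74 00
print(len(utf8), len(utf16_le))  # 8 16

# A viewer that assumes one byte per character sees NUL bytes between the
# letters. They may be drawn as gaps, but they are not space characters (0x20).
as_single_bytes = utf16_le.decode("latin-1")
print(as_single_bytes.count("\x00"), as_single_bytes.count(" "))  # 8 1
```

    The only real space in the sample is the 0x20 between the two words; the other apparent gaps are zero bytes.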

    >”Note, I shared text output for all to inspect – look at it, it’s just cut-and-paste – and I cited the source, so go ahead and dive in, anyone who wants to explore.”

    >”it’s just cut-and-paste – and I cited the source”

    To be able to paste, you have to copy something to the clipboard first. You left out one important part of what you presented in your comment. You did not cite the source program displaying the data you copied from. You only cited the original book PDF, which was then exported to Unicode text.
    Next, you viewed that Unicode text using some program but you have not said what program it was.
    Whatever that program was can affect how the text appears, and if it manipulates the data to allow you to see the text characters in some manner, then you are not copying the original Unicode data. Therefore, pasting that into a comment is meaningless to convince anyone what the original Unicode contained. That is what magicians do to convince the viewer of something. They do not tell you all of the middle steps they took.

    What program were you using to display the text that you copied from?

    As I pointed out, depending on what mode of encoding Microsoft Notepad was last used as, it can display the contents of the same Unicode file with the characters right next to each other (1 byte) or appear to be separated (2 bytes). You cannot go on what you see with even the simplest tool such as Notepad. You mentioned using a “Word-equivalent” program. That is a poor choice for viewing the exported Unicode text data.

    As I mentioned, I exported “The Boy Electrician” to the Unicode text, and received what I expected. You may be wondering why I did not post any of the output. Even though I might have Unicode text that has not been altered, the data would get manipulated as soon as I attempted to post it into a comment. Ever heard of script injection/code injection?

    When typing or pasting anything into a web comment form, do you think webpages just accept any data from the forms? For a page to accept anything placed in a form is an excellent way of allowing any user to attack the server. There are many warnings online about these dangers, and how web developers need to handle unknown input. It is normal behavior to strip the entered data down to basic characters (especially bytes of unprintable characters, or the high bits in Unicode 2-byte characters for English), so what the user sees will never be the same as the Unicode it came from. Likewise, what we see from what you pasted into the comment form is not the exported Unicode text it supposedly started from.
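
    To illustrate the kind of input handling described above, here is a hypothetical sanitizer. It is a sketch of the general technique only; `sanitize_comment` is a made-up name, and nothing here is known about how this particular forum processes comments:

```python
import html
import unicodedata


def sanitize_comment(raw: str) -> str:
    """Hypothetical example: drop control characters, then escape HTML markup."""
    kept = []
    for ch in raw:
        # Unicode category "Cc" is control characters (NUL, ESC, and so on).
        # Newlines and tabs are kept here; a real site may treat them differently.
        if unicodedata.category(ch) == "Cc" and ch not in "\n\t":
            continue
        kept.append(ch)
    # html.escape turns <, >, &, and quotes into harmless entities.
    return html.escape("".join(kept))


print(sanitize_comment("T\x00h\x00e\x00 <script>alert(1)</script>"))
```

    After a pass like this, the stored comment no longer carries the zero bytes or raw markup that were in the pasted data.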

    All your effort was a moot point from the start. Either you didn’t have a clue about what you were seeing or doing, or you might be trying to scam the reader. You may be convincing to the inexperienced, but don’t expect that to work with everyone.

    You cannot determine if the exported text is correct or not by looking at it using any tool. You can only use a tool to tell you what encoding it is. Then, if you have the means (or tool) to interrogate the data byte by byte (e.g., hex editor), only then can you see what the contents are. Unicode files should have a few bytes at the start of the file that identify what type of encoding the data is in. As soon as you copy any characters or bytes further on, you then only have a string of characters. They alone cannot be used to judge or determine what type of encoding was used without some other interrogation. In some cases, it is impossible to determine what encoding the string of characters came from. If the program you used to copy from was not made to handle Unicode text, you could have started copying in the middle of a 2-byte character. Without knowing what program you used to view the Unicode file, and where you started your copy, we have no idea what that content would be.
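
    The "few bytes at the start of the file" are the byte order mark (BOM). A rough sketch of checking for one, using the BOM constants from Python's standard `codecs` module; `sniff_bom` is a hypothetical helper name, and a file without a BOM simply yields no answer:

```python
import codecs

# Known byte order marks, UTF-32 first because its little-endian mark
# begins with the same two bytes as the UTF-16 little-endian mark.
BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32-le"),
    (codecs.BOM_UTF32_BE, "utf-32-be"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16-le"),
    (codecs.BOM_UTF16_BE, "utf-16-be"),
]


def sniff_bom(data: bytes):
    """Return an encoding name if the data starts with a known BOM, else None."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


sample = codecs.BOM_UTF16_LE + "The Ohm".encode("utf-16-le")
print(sniff_bom(sample))        # utf-16-le
print(sniff_bom(sample[2:]))    # None: the mark is gone once you copy from further on
```

    The second call shows the point made above: bytes copied from the middle of a file carry no marker of their encoding.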

    >”I note that tab-equivalent-delimited results are unrelated to Unicode, that is, the spaces between words are variable, but exactly what is needed to align the next word with the previous row of word’s alignment, that is, a column, a spreadsheet, tab-delimited, by spaces or tabs or whatever it takes to line up the best word regardless of the length of the current word, but hey …”

    Export to text is not for the purpose of making anything line up or be formatted. If the result does line up, then it is a coincidence due to the original PDF having those characters in the same order or the tool you used has manipulated the text to have tabs.

    __________

    >”I used a PDF of “An Elegant Defense” book by Matt Richtel”
    >”I shared a source that was free to me, and not available in other formats.”

    The book is available on Amazon as a hardcover, paperback, audio CD, Audiobook, and Kindle so it is available in different formats. It is NOT available from Amazon as a PDF, so I am pretty sure the PDF edition you used was “generated” by someone.

    >”you try text-exporting a PDF book with a numbered table of contents yourself, and see if Soft-Xpansion Perfect PDF 9 Editor decides it’s a spreadsheet and separates every word with tab-equivalent spaces, and share the results as I have”

    I did do an export and it did not separate any words with a tab (what you are calling tab-delimited spaces). In the entire output, there is not a single tab.

    I did not attempt to show in a comment because I know better. Apparently, you don’t even realize that what you did is meaningless.

    >”“The Boy Electrician” book by Alfred Powell Morganis is freely available side by side with other freely available file formats.”

    That is correct. I did not choose the book based on whether there were other formats available, same for your choice, whether you knew it or not (the “An Elegant Defense” book by Matt Richtel is also available in other formats). Being available in other formats has nothing to do with the issue of whether Soft Xpansion Perfect PDF 9 can export a book with a table of contents, page numbers, etc., and how it handles it. I simply was using a freely available “PDF book with a numbered table of contents,” in order to have a source that had the same things you mentioned that were in your book. Therefore, if Perfect PDF 9 exports the table of contents in columnar form, it would show up. It didn’t. Since you did not use a tool that can view Unicode accurately, it is likely the reason you have tabs. Use a hex editor/viewer and show us the source. If the source has tabs, then they simply were passed on to the output, but with no intention of formatting that output.

    >”The “The Boy Electrician” PDF is not original from the publisher, but was generated, …”

    What? … what kind of curveball is that? What PDF files are not generated? You act as if the book you used was not generated? I have never heard of a PDF book that was not generated. It would mean the book was composed entirely in a PDF editor instead of something like MS Word. Possible, but the formatting abilities of PDF editors have not quite caught up to the same as MS Word. Even if created entirely in a PDF editor, the final step is to “GENERATE THE PDF” so where are the PDF books that are not generated hiding?

    Most likely the author used something like Microsoft Word, or a “writers tool” to compose the book. It probably went through several iterations of editing and proofreading before deciding the book was in a publishable state. After it was prepared for typesetting, they also made digital versions to send to the printer. They may have even generated a PDF to send to the printer. Either way, at some point, someone made a PDF version. Whether that was done by the original publisher or not, is not known, nor does it matter when it was “generated” or by whom.

    If you look at the PDF edition of The Boy Electrician, the pages look like the original book. That is because the original book was OCRed, then the output of the OCR corrected, and formatted to create a source that could be used to typeset/print as near a copy of the book as the original. The PDF is just one of the end results of those steps. The book has the same qualities that your book had and that you specified as being critical parts to see what the Perfect PDF 9 export to text looked like. Again, whether other formats are available or not has nothing to do with the issue being debated.

    I used Perfect PDF 9 to export “The Boy Electrician” to text, and it worked as expected. It was all in Unicode, and by the way, there was not a single tab character in the entire output.
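
    A claim like "not a single tab" can be checked by anyone without relying on how a viewer draws the text. A small sketch, assuming the export is UTF-16 with a byte order mark; `count_tabs` is a hypothetical helper and the sample bytes are made up for the demonstration:

```python
def count_tabs(data: bytes, encoding: str = "utf-16") -> int:
    """Decode the exported bytes and count real tab characters (U+0009)."""
    text = data.decode(encoding)  # the "utf-16" codec honors a leading BOM
    return text.count("\t")


# Made-up sample: a BOM, then two words separated by one real tab.
sample = "\ufeffThe\tVolt".encode("utf-16-le")
print(count_tabs(sample))  # 1
```

    For a real test you would read the exported file's raw bytes (for example with `pathlib.Path(...).read_bytes()`) and pass them in unchanged.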

    >”the PDF is not a product of original publisher, ”

    Correct, and again that has nothing to do with it. We are testing a PDF with a table of contents; it does not matter if the book is the latest on the market or from 1913. For both, the PDF editions are recent. There was no need for the PDF to be a product of the original publisher (your PDF may not have been either). That was not in your original stipulation, and if it was, I would have a good laugh. Perfect PDF 9 does not know whether it is a product of the original publisher, nor does it care. That has nothing to do with whether Perfect PDF 9 can export a PDF with a table of contents with or without tabs.

    >”and there is no need to export text from it.”

    Again, where are these odd curve-balls coming from? What does “need” have to do with it? Did you forget the issue is whether Perfect PDF 9 is doing the right thing when exporting to a text file? “Need” has nothing to do with it.

    If you do not like using “The Boy Electrician” pick something else, as long as it is freely available to all, and easy to download. That way, anyone can do the same test.

    You mention you get PDFs from different sources and that there are a gazillion places on the Internet that have PDFs available, so use several of those other PDFs and see what you get. Don’t base your entire judgment on one example. You owe it to yourself to see if Perfect PDF 9 handles all of the books the same way. When you can prove that all PDFs with a table of contents come out the same way, then you have something. As it is now, you have nothing.

    >”As mentioned, an original publisher probably can purposefully insert non-printable codes that sully the PDF’s auto-conversion to text, I do not know, nor do I care.”

    No, an original publisher probably CANNOT purposefully insert non-printable codes that sully the PDF’s auto-conversion to text. It wouldn’t matter if they insert any characters or not. It doesn’t matter what the original contained besides the actual text content because only the text is exported; other characters are ignored. Perfect PDF 9 does not claim to be attempting to produce a formatted version of the original PDF.

    It seems like you were expecting text output that you could open with something like MS Word and have a source file that could be used to produce a PDF that looks the same as what the exported text came from. Perfect PDF 9 does not do that. When asked, they answer that it is not for that purpose.

    PDF tools are not limited in what they can do, so some PDF editors do export to MS Word format. I am sure in the future, there will be more and more universal PDF tools that do it all.

    >”My point, and I do have one, is that Soft-Xpansion Perfect PDF 9 Editor seems to scrutinize the internal coding structures, and miss the human-interpretable presentation.”

    It doesn’t matter what it might “seem like” they are doing; that is something you evolved in your head. What is important is what they claim to do and what they actually do. They do not claim to attempt generating formatted text; they merely export the text characters found. In every case I have tried, that is exactly what they have done.

    They are not scrutinizing the internal coding structure to produce a human interpretable presentation but miss. They are merely collecting the text characters found in the original, and dropping the rest. That’s what “export to text” means in its simplest form. Therefore, they are not missing something they are not trying to do in the first place. It is you who has evolved the thought in your mind that they are attempting to create a formatted output, similar to what a conversion to MS Word would look like, only that it would be plain characters. They absolutely have stated that they are not doing that.

    >”Who knows, maybe the PDF I started with has a code between words, and Soft-Xpansion honored that … it does not matter, really.”

    No, Perfect PDF 9 is simply writing out the characters found in the source.

    >”Who knows”

    Rather than continuing to guess, look at your source with a hex editor and see what is there. If you want to paste your hex codes found into a comment, that should work. If you use a hex display that shows the actual characters plus the hex values, it isn’t going to work. Paste only the representation of the hex values (simple ASCII text).
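
    If no hex editor is handy, a few lines of script can produce the plain ASCII hex representation suggested above. This is only a sketch; `hex_lines` is a hypothetical helper, and the sample bytes are invented to show what a tab would look like in 2-byte little-endian text:

```python
def hex_lines(data: bytes, width: int = 16):
    """Yield offset-prefixed lines of hex byte values, using ASCII characters only."""
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        yield f"{offset:08x}  {chunk.hex(' ')}"


# Invented sample: BOM, "A", a tab, "B" in UTF-16 little-endian.
sample = b"\xff\xfeA\x00\t\x00B\x00"
for line in hex_lines(sample):
    print(line)  # 00000000  ff fe 41 00 09 00 42 00
```

    In output like this, a real tab shows up as `09 00` and a real space as `20 00`, so there is nothing left to guess about.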

    I don’t know why you based your entire claim on exporting one PDF file when you have plenty of other PDFs to see if they all do the same. Anyone claiming to be a “computer tech” would have done that instantly instead of blaming a company and their product that has been around for years. It would be bad enough to come up with such an untested theory without saying anything publicly. To put it out there so everyone can see speaks volumes.

    #16520742 Reply | Quote
    Alberto Acosta
    Guest

    Excellent application

    #16520749 Reply | Quote
    Alberto Acosta
    Guest

    Excellent app

    #16526489 Reply | Quote
    Peter Blaise
    Guest

    What’s your point, [@Gary]?

    Soft-Xpansion Perfect PDF 9 Editor
    – failed to come on screen for some of us,
    – even in trial mode,
    – and the vendor has still not replied to our emails ( yet ).
    __________

    Features and benefits wise, the program appears to be not very special or smart.

    I chose a free PDF to explore, a PDF direct from the publisher via a local free library electronic archive, a PDF I already had in my possession, a PDF by the publisher that contains direct text, not OCR from a scanned image, the PDF was not independently converted from another document type to PDF.

    You chose a PDF that was generated ( their word, ask the Gutenberg project to explain ), independently converted from another document type, a PDF from a list that included a text version ( hence no need to export PDF-to-text, the text version is “clean” of interruptive page numbers, footers, and headers, in the middle of sentences ).

    Regardless, Soft-Xpansion Perfect PDF 9 Editor export to text results for both of us were unaware of the human-perceived meaning of the contents.

    You accept that without comment.

    I do not accept that, and commented that it would take 300+ pages of editing to eliminate the debris, suggesting alternatively cropping the pages to their content minus the marginalia before exporting, or programming macros in Word-equivalent to clean up debris, or some other such scheme.
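
    As a rough idea of what such a cleanup macro might do, here is a crude sketch. `clean_export` is a made-up name, and the rule it uses (a line holding only digits is a page number) is an assumption that would also eat legitimate number-only lines and would merge headings into the running text:

```python
import re


def clean_export(text: str) -> str:
    """Drop lines that hold only a page number, then rejoin wrapped lines."""
    kept = []
    for line in text.splitlines():
        # Assumption: one to four digits alone on a line is a page number.
        if re.fullmatch(r"\s*\d{1,4}\s*", line):
            continue
        if line.strip():
            kept.append(line.strip())
    # Joining with spaces removes the meaningless line endings.
    return " ".join(kept)


sample = "may be made the basis of a\n94\nsystem of measurement."
print(clean_export(sample))  # may be made the basis of a system of measurement.
```

    Headers, footers, and chapter titles would each need their own rule, which is why the editing burden grows with the size of the book.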

    Or opening any PDF in any PDF reader, like free Google Chrome, and highlighting the text, and then copying-and-pasting, which works fine for both of our PDF samples – who needs an export function in a PDF reader, especially if it is even less intelligent than simple copy-and-paste?

    Here’s a sample of cut and paste from “The Boy Electrician” generated PDF across 2 page numbers, why waste your time when a clean text version is immediately available side-by-side at the Gutenberg website that has no page number interruptions in the middle of sentences?

    by comparing the force which gravity exerts in pulling it to the earth with the
    same effect of gravity on another standard ”weight.”
    Electric current is invisible and weightless, and for these and other reasons
    cannot be measured by the quart or weighed by the pound. The only way that
    it can be measured is by means of some of the effects which it produces. Either
    the chemical, electro-magnetic, or the heating effects may be made the basis of a
    94
    system of measurement.
    The first method used to measure electric current was the chemical one.
    If a current is passed through a solution of a chemical called coppersulphate
    (blue vitriol) by means of two copper plates, copper will be deposited on one plate
    and dissolved from the other. If the current is furnished by a battery the copper
    will be deposited on the plate connected with the zinc of the battery. If the current
    is allowed to flow for a short time and the two copper plates are then taken out
    and weighed it will be found that one plate is considerably heavier than the other.
    The copper has been taken from one plate and deposited on the other by the
    electric currents. The amount of electric current which will deposit 1.177 grammes
    of copper in an hour is called an ampere. The ampere is the unit of electrical
    current measurement, and implies quantity or amount.
    The chemical method of measuring current was at one time put to practical
    service in the distribution of electric current for lighting and power. Many years
    ago the house meters, used to measure the current, consisted of a jar containing
    two copper plates. The current used in the house would cause copper to deposit
    on one plate, and by weighing the plate the power company could determine
    the amount of current used, and thereby the amount of the bill. The meters
    nowadays make use of the magnetic effects of the current instead of the chemical,
    as described later on.
    The Volt
    For purposes of explanation the electric current may be likened to a stream of
    water flowing through a pipe.
    If you hold your thumb over the end of a water-pipe through which water
    is flowing it will push your thumb away because of the pressure which the water
    exerts.
    Electric currents also exert a pressure, only it is not called pressure in electrical parlance, but, spoken of as electromotive force or potential.
    The pressure of the water enables it to pass through small openings and to
    overcome the resistance offered by the pipe.
    Wires and other electrical conductors do not offer a perfectly free path to
    an electric current, but also possess a resistance. It is the potential of the electromotive force which overcomes the resistance and pushes the current through the
    wire.
    Advantage has been taken of the fact to fix a unit of electrical pressure
    called the volt. The pressure of the water in a water-pipe is measured in pounds,
    but the pressure of an electric current in a wire is measured by volts. The volt
    is the unit of electrical force which will cause a current of one ampere to flow
    95
    through a resistance of one ohm.
    The Ohm
    The ohm is the unit of electrical resistance. The standard ohm is the resistance
    offered by a column of pure mercury having a section of one square millimeter
    and a length of 106.28 centimeters at a temperature of 0° centigrade.
    The pressure which will force sufficient current through such a column of
    mercury to deposit 1.177 grammes of copper in one hour is a volt, and in doing

    No tab spaces between words, line endings meaningless, page numbers in the middle of sentences.

    You don’t mind, I do.

    Not-free Soft-Xpansion Perfect PDF 9 Editor is not needed, free Google Chrome works just fine to export text from a PDF above, plus, as noted, a much cleaner text version was already immediately available right next to the PDF.

    Why do you think that your speculating that Soft-Xpansion Perfect PDF 9 Editor exports each character as it finds it, is superior speculation to my speculation that it converts code from within a PDF to whatever the programmers told it to convert, including non-printing internal PDF codes?

    By the way, free Google Chrome exported text from a PDF of “An Elegant Defense” book by Matt Richtel much faster, easier, and more usefully accurately via cut-and-paste, WITHOUT TAB DELIMITING BETWEEN WORDS, compared to Soft-Xpansion Perfect PDF 9 Editor’s text export.

    Yeah, free Google Chrome is better.

    As I wrote, Soft-Xpansion Perfect PDF 9 Editor appears to be a capable program, at least as capable as a 10-year-old version of Adobe Acrobat Pro MINUS OCR, PLUS QUIRKS, and as such probably can be useful for direct editing of a PDF … and that makes it not special, not competitive, not contemporary, not trustworthy ( the quirks ), but if someone has no other PDF editor, it’s better than nothing ( except free Google Chrome for neater cleaner faster text export, apparently ).

    That the company has been in business for 25 years is meaningless, and does not mean they are appropriate for any one of us, as I have already commented, if their customers are corporate PDF generators, not end users like us who have unidentified PDFs arriving from all over the place and in any condition, we probably need PDF editors with sophisticated intelligence that have more of a chance of auditing a PDF for a variety of reasons, to convert to black-and-white, to make as small as possible, to perform OCR optical character recognition, to crop all scanned pages to the same dimensions around the content, to trim off marginalia in preparation for text export without page numbers, headers, and footers in the way of the text, and so on.

    Soft-Xpansion Perfect PDF 9 Editor is NOT it.

    Nor has the company overcome the first three failures:
    – failed to come on screen for some of us,
    – even in trial mode,
    – and the vendor has still not replied to our emails ( yet ).

    But, hey, [@Gary], if you want to discuss unicode and ignore a readily available properly formatted text file staring at you from the same web page where a generated PDF has worse text, go right ahead, none of that has anything to do with Soft-Xpansion Perfect PDF 9 Editor, or SOS, or me.

    PS – I have found a freely available epub version of “An Elegant Defense” book by Matt Richtel as released direct from the publisher, so my text export from the PDF is now just an exercise to be deleted.

    Thanks for exploring this and sharing.
    .

    #16588772 Reply | Quote
    Peter Blaise
    Guest

    And another alternative to check out: FREE PDF Editor from SoftMaker:

    [ https :// www. getfreepdf. com/en/ ]

    You’re welcome.
    .
