PerlMonks  

Using perl script to convert .html file to tab delimited file

by ssaahh (Initiate)
on Oct 01, 2020 at 09:56 UTC ( [id://11122405]=perlquestion )

ssaahh has asked for the wisdom of the Perl Monks concerning the following question:

This is not a post about debugging code; I have zero knowledge of Perl. Recently I wanted to use a particular dictionary on my Kindle e-reader. I downloaded it from the internet, but it is in .epub format, which Kindle e-readers do not support. The file is in 'dictionary format', not a normal book .epub. When I converted it to .mobi with calibre, it came out as a normal book .mobi, so the Kindle recognises it as a normal book and not as a dictionary. I came across a post that deals with exactly this problem, but it uses two scripts (one in Perl and the other in Python), and I have no programming experience.

URL of that post: https://www.mobileread.com/forums/showthread.php?p=2562381#post2562381

So, assuming you have gone through that post, I need help with how to actually use the dicthtml2tab.pl file. Any help with this problem would be much appreciated.


Replies are listed 'Best First'.
Re: Using perl script to convert .html file to tab delimited file
by hippo (Bishop) on Oct 01, 2020 at 10:10 UTC

    Hello, ssaahh. Welcome to the Monastery.

    The script dicthtml2tab.pl is quite simple. It takes the input file as an argument and writes to standard output. Therefore, assuming you are on a unix-like OS or at least using a standard shell, the steps to use it are:

    1. Download the file
    2. Make sure you can run it: chmod a+rx dicthtml2tab.pl
    3. Run it: ./dicthtml2tab.pl myinputfile > myoutputfile
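
    The steps above can be wrapped in a small shell function. This is only a sketch: the function name run_dict2tab is made up for illustration, and myinputfile and myoutputfile are the same placeholder names used in the steps, not real files from the linked thread.

```shell
#!/bin/sh
# run_dict2tab SCRIPT INPUT OUTPUT   (hypothetical helper, not part of the linked thread)
# Runs SCRIPT on INPUT and saves whatever it prints to OUTPUT.
run_dict2tab() {
    script=$1 input=$2 output=$3

    # Step 1 is the download; stop early if the script or the input is missing.
    [ -f "$script" ] || { echo "missing script: $script" >&2; return 1; }
    [ -f "$input" ]  || { echo "missing input: $input" >&2; return 1; }

    # Step 2: make the script readable and executable.
    chmod a+rx "$script" || return 1

    # A bare file name needs a ./ prefix, because the current directory
    # is usually not on PATH.
    case $script in
        */*) ;;
        *) script=./$script ;;
    esac

    # Step 3: the script writes to standard output, so redirect it to a file.
    "$script" "$input" > "$output"
}

# Example call with the placeholder names from the steps above:
# run_dict2tab dicthtml2tab.pl myinputfile myoutputfile
```

    The redirect is what turns the script's standard output into a file; without it the converted text would only scroll past in the terminal.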

    Note that the dicthtml2tab.pl script uses regex to parse HTML which is, at best, fragile. If this is a common enough task then it might be better for someone to write and publish a new script which uses a proper HTML parser instead. But that's for another day.


    🦛

      First of all, thanks for replying. I use Windows 10, so how do I use it there?

        Strawberry Perl, or Perl via Cygwin.
Re: Using perl script to convert .html file to tab delimited file
by marto (Cardinal) on Oct 01, 2020 at 10:11 UTC

    It would probably make more sense to ask in that existing thread, since the author of these scripts posted them there. The first post includes a batch script which automates the process for you.

      I actually asked the same question on the Kindle subreddit on Reddit and on mobileread.com, but no one responded. That is why I asked here. At least someone responded here.

        The last post in the thread you posted here is from 2013. Replying to the author if you have a question about something they did is usually a sensible first step.
