PerlMonks
Howdy, partner! Name's Apple Fritter, pleasure to meet y'all! I use Perl, but I don't know that much about it (yet). I'm trying to change that, so I frequent the Monastery, reading others' answers and code to learn, and providing my own answers and code to hone my skills.
If I come across useful advice, tips, modules, code snippets, articles etc., I usually add it to my home node (which you are reading right now) for future reference. Maybe you'll find it useful, too!
Not affiliated with Tom Owad's applefritter.com.
Note: I'm not active on Perlmonks anymore. I may still update my home node when I come across items worth adding.
For new users:
Introductions to the Monastery:
On civility/kindness:
You don't always have to agree with your companions on the road, but it certainly helps to be friendly if you disagree.
Always be civil. [...] Civility is simple: stick to the facts while avoiding demeaning remarks and sarcasm. It is not enough to be factual. You must also be civil. Responding in kind to incivility is not acceptable.
While civility is required, kindness is encouraged; if you have any doubt about whether you are being civil, simply ask yourself, "Am I being kind?" and aspire to that.
My main advice to everybody related to this is for one to only respond to questions where one has something helpful to offer in response and for which one is particularly suited to answer. [...]
If a question annoys you, then minimize your annoyance by immediately moving on to something more enjoyable for you. Please try to refrain from sharing your annoyance so that we all get to suffer from it. Most of you are probably even clever enough to figure out a lot of the questions that are likely to end up annoying you so you can avoid even clicking through to them in the first place.
If a question annoys everybody, then everybody will ignore it. The history of the internet says that's one of the best ways to end something. If the question doesn't annoy everybody, then we have a case of somebody asking a question and some others willingly answering the question via a web site. That sounds a lot like "success".
Introductions to Perl and resources for learning Perl:
Introductions, first steps and general information:
Tutorials:
Best practices and other information:
FAQs:
Books:
For non-IT folks, e.g. biologists:
There are also many books dedicated to specific topics such as Perl/Tk, DBI, Perl and ☞XML, ☞CGI programming with Perl, and much, much more; see Perl Reference Materials: Books for an (outdated) list.
Other lists and resources:
Reviews, opinions etc.:
Asking questions (on Perlmonks and elsewhere):
How to ask questions (based on ww, Re: Replace key pair value from one to other file):
Asking questions effectively:
Formatting your write-up:
Other places to get help:
Other places to learn about Perl:
N.B. when crossposting to several sites, it is considered polite to inform readers of this and provide links to avoid unnecessary/duplicated effort.
For established users:
Combinatorics:
Daemons:
Databases:
Also see ☞Unicode flags for database drivers further down.
Data munging:
Data structures:
Date/time manipulation:
Articles:
Parsing:
Time zone conversion:
Debugging:
Design patterns:
Distros, packages etc. (e.g. for Windows users):
Email:
Addresses:
Errors / Warnings:
*CORE::GLOBAL::die = sub { require Carp; Carp::confess(@_) };
eval / Exceptions:
External commands:
File input/output:
Input:
Output:
File names:
Graphs (the mathematical kind):
General information/further reading:
Useful modules:
External tools:
Graphs, charts and plots:
Modules:
HTML:
Parsing:
General tips:
Articles:
Modules:
List processing:
Logic:
Logging:
Math:
Basic arithmetic:
Large numbers:
Marshalling/serialization:
MediaWiki:
Bots/API:
OOP (object-oriented programming):
Operators:
Optimization:
[O]ptimising an algorithm may actually consist in optimising its underlying data structures. Obvious? Yes, but still worth a reminder now and then.
Efficiency gains should be targeted based on real world profiling and not based on review of code that "looks slow" as you will waste a ton of time lost in the details, chasing down non-existent performance problems, sometimes making things worse if you fight the compiler, and missing the big issues which are usually more fundamental to the design and data structure usage or locking in the hottest paths of the application.
Option processing:
References:
Any design problem can be solved by adding an additional level of indirection, except for too many levels of indirection.
Regular expressions, parsing and grammars:
HOWTOs, tutorials etc.:
Debugging regular expressions:
In-depth documentation and references:
Books:
Useful CPAN modules:
Misc.:
Security:
Signals:
Sorting:
Idioms:
Articles, columns and essays:
HOWTOs:
Useful CPAN modules:
Statistics (the mathematical kind):
Temporary files:
Text input/output:
Input:
Output:
Threads:
General:
Sharing data between threads:
UIs:
Unicode/UTF8:
HOWTOs, BCPs, tips and tricks:
The correct order of operations for working with encoded data (whether utf8 or any other encoding) is:
- Input
- Decode
- Operate
- Encode
- Output
If you don't decode your input you'll be comparing apples and elephants which is why your regex fails to match. However, if you do no operations on the data at all, then you can skip the middle three steps because your perl script in that case is just essentially a pipe between your input (eg. database) and your output (eg. web page).
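The five steps above can be sketched with the core Encode module. This is a minimal illustration, not code from the quoted post: the sample bytes and variable names are made up, and the input is hard-coded instead of being read from a database or file.

```perl
use strict;
use warnings;
use feature 'unicode_strings';    # consistent Unicode semantics for uc() etc.
use Encode qw(decode encode);

# Input: hypothetical sample data, the UTF-8 bytes for "café".
my $input_bytes = "caf\xC3\xA9";

# Skipping the decode step: the two bytes of "é" count as two characters,
# so a regex looking for the single character U+00E9 cannot match.
my $undecoded_matches = $input_bytes =~ /\Acaf\x{E9}\z/ ? 1 : 0;

# Decode: bytes -> characters.
my $text = decode('UTF-8', $input_bytes);

# Operate: regexes, length(), uc() and friends now see characters.
my $decoded_matches = $text =~ /\Acaf\x{E9}\z/ ? 1 : 0;
my $shouted = uc $text;

# Encode: characters -> bytes, ready for output.
my $output_bytes = encode('UTF-8', $shouted);

# Output: report what happened (ASCII only, so no output layer is needed).
printf "undecoded match: %d, decoded match: %d\n",
    $undecoded_matches, $decoded_matches;
printf "input: %d bytes, text: %d characters, output: %d bytes\n",
    length $input_bytes, length $text, length $output_bytes;
```

In a real script the decode and encode steps are often delegated to I/O layers (for example `open my $fh, '<:encoding(UTF-8)', $file`), but the order of operations stays the same.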
[...] MySQL offers a "charset" named UTF8. Guess what, it's not UTF8. It's actually a synonym for UTF8MB3, which is MySQL's bizarre internal "UTF8 except we only allow 3 bytes per character" rule. If you actually need UTF8 you must upgrade to a very new version and explicitly ask MySQL for "UTF8MB4".
Anybody who has used MySQL before can guess what happens if you try to insert actual Unicode data (say, an HTML-ised comment your PHP blogging framework wants to store) into one of these UTF8 columns. Afraid to incur your wrath with an error you probably haven't handled correctly, MySQL will quietly truncate the string, removing everything from the offending codepoint onwards. [...]
Since this often leads to confusion, here are a few very clear words on how Unicode works in Perl, modulo bugs.
Perl strings can store characters with ordinal values > 255.
This enables you to store Unicode characters as single characters in a Perl string - very natural.
Perl does not associate an encoding with your strings.
... until you force it to, e.g. when matching it against a regex, or printing the scalar to a file, in which case Perl either interprets your string as locale-encoded text, octets/binary, or as Unicode, depending on various settings. In no case is an encoding stored together with your data, it is use that decides encoding, not any magical meta data.
The internal utf-8 flag has no meaning with regards to the encoding of your string.
Just ignore that flag unless you debug a Perl bug, a module written in XS or want to dive into the internals of perl. Otherwise it will only confuse you, as, despite the name, it says nothing about how your string is encoded. You can have Unicode strings with that flag set, with that flag clear, and you can have binary data with that flag set and that flag clear. Other possibilities exist, too.
If you didn't know about that flag, just the better, pretend it doesn't exist.
A "Unicode String" is simply a string where each character can be validly interpreted as a Unicode code point.
If you have UTF-8 encoded data, it is no longer a Unicode string, but a Unicode string encoded in UTF-8, giving you a binary string.
A string containing "high" (> 255) character values is not a UTF-8 string.
It's a fact. Learn to live with it.
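A small sketch may make two of these points concrete: a character string is not the same thing as its UTF-8 encoding, and the internal flag does not change what a string means. The variable names are illustrative only; `utf8::upgrade` and `utf8::is_utf8` are used here purely to peek at the internals, which is exactly what the advice above says you normally should not need to do.

```perl
use strict;
use warnings;
use Encode qw(encode);

# A character string holding one character with an ordinal value > 255.
my $smiley        = "\x{263A}";
my $smiley_length = length $smiley;      # counts characters

# Encoding produces a binary string of UTF-8 octets.
my $octets        = encode('UTF-8', $smiley);
my $octets_length = length $octets;      # counts octets

# The same character (U+00E9) stored in two different internal formats.
my $native   = "\xE9";
my $upgraded = "\xE9";
utf8::upgrade($upgraded);                # changes internal storage only

# The strings are equal even though the internal flag differs.
my $same_string  = $native eq $upgraded ? 1 : 0;
my $flags_differ =
    (utf8::is_utf8($native) ? 1 : 0) != (utf8::is_utf8($upgraded) ? 1 : 0)
    ? 1 : 0;

print "characters: $smiley_length, octets: $octets_length\n";
print "same string: $same_string, flags differ: $flags_differ\n";
```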
Win32-specific issues:
Useful CPAN modules:
Fonts:
Scripts and tools:
Talks, articles, references, presentations and meditations:
Interesting questions and discussions:
Using utf8 in your script proper:
If working under the effect of the use utf8; pragma, the following rules apply:
/ (?[ ( \p{Word} & \p{XID_Start} ) + [_] ]) (?[ ( \p{Word} & \p{XID_Continue} ) ]) * /x
That is, a "start" character followed by any number of "continue" characters. Perl requires every character in an identifier to also match \w (this prevents some problematic cases); and Perl additionally accepts identifier names beginning with an underscore.
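To try the identifier rule out, the pattern can be anchored and wrapped in a small helper. This is only a sketch: `is_identifier` is a hypothetical name, the sample strings are made up, and it assumes Perl 5.18 or later, where the `(?[ ... ])` construct is available (it was flagged experimental before 5.36, hence the `no warnings` line).

```perl
use 5.018;
use strict;
use warnings;
no warnings 'experimental::regex_sets';   # (?[ ... ]) was experimental before 5.36

# The pattern quoted above, anchored so the whole string must be an identifier.
my $ident_re = qr/ \A
    (?[ ( \p{Word} & \p{XID_Start} ) + [_] ])
    (?[ ( \p{Word} & \p{XID_Continue} ) ]) *
    \z /x;

# Hypothetical helper: returns 1 if $name looks like a valid identifier, else 0.
sub is_identifier {
    my ($name) = @_;
    return $name =~ $ident_re ? 1 : 0;
}

# Sample names; non-ASCII characters are written as escapes to keep this file ASCII.
for my $name ('_foo', 'foo_bar1', "na\x{EF}ve", '1foo', 'foo-bar') {
    printf "%-10s %s\n", ($name =~ /\A[\x20-\x7E]+\z/ ? $name : '(non-ASCII)'),
        is_identifier($name) ? 'ok' : 'not an identifier';
}
```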
Variables:
WWW:
XML:
Misc.:
The Monastery:
Experience (XP) and levels:
History:
Monks:
Orders:
Write-ups:
The Chatterbox:
Misc.:
General infrastructure:
Modules / *PAN:
News etc.:
Perl culture:
General:
Cool JAPHs:
Golfing:
The Lighter Side of Perl Culture:
ACME:: modules:
Misc. (unordered, unsorted):
Due to the 64 KiB node size limit, this section now resides in AppleFritter's scratchpad.
Monk quotes:
Do not fear death, you will re-awaken to a world built with Perfect Perl 7 and no Python.
-- boftx, Re^3: Using die() in methods
the moment you try to separate the physical construction of code -- kloc, function points, abstracts test quantities -- from the intellectual processes of gathering requirements; understanding work-patterns and flows; and imagining suitable, appropriate, workable algorithms to meet them; you do not have sufficient understanding of the process involved in code development to be making decisions about it.
-- BrowserUk, Re: Nobody Expects the Agile Imposition (Part VII): Metrics
You were unlucky in the sense that your program seems to have remained valid Perl even with all variables removed.
-- Corion, Re: [OneLiner] What am I doing wrong in my regex?
I insist on being paid to use Windows products, sir!
-- Your Mother, Re^3: PerlWizard - A free wizard for automatic Perl software code generation using simple forms
No further rational discussion is possible here because I find your preferred style utterly abhorrent :)
-- BrowserUk, Re^3: Porting (old) code to something else
AppleFritter elsewhere:
Two monks sat together for lunch. The first monk said, "What do you see when you see me?"