Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl: the Markov chain saw
 
PerlMonks  

Cool Uses for Perl

( [id://1044] : superdoc . print w/replies, xml ) Need Help??

This section is the place to post your general code offerings -- everything from one-liners to full-blown frameworks and apps.

CUFP's
"Terminal Velocity", a better Linux terminal graphics demo
1 direct reply — Read more / Contribute
by cavac
on Feb 18, 2024 at 07:36

    Last week i released a simple graphics demo for the Linux terminal (Fun with terminal color).

    The low framerate and the mostly static graphics bothered me a bit. So, i , uhm did it yet again. Another demo, this time using Inline::CPP and massaged versions of tinyraytracer and tinyraycaster to provide some actual graphical content. As a matter of fact, Inline::CPP didn't work for my borrowed(*) code, and my understanding of CPP is a 20 years out of date. So i override the Inline::CPP RecDescent module to ignore my bugs. Hey, it's not production code, just a demo...

    As in the last demo, your Terminal needs to support full RGB colors and have a size of at least 270x60 in size (characters, not pixels). SDL is sort-of-optional this time; the demo will run without sound if it can't load SDL. And as said above, you'll need to install Inline::CPP as well.

    Here's the mercurial repository: https://cavac.at/public/mercurial/demos/terminalvelocity/

    And the YouTube video: https://www.youtube.com/watch?v=MWcuI2SXA-A. OBS and YT compression did munge the quality a bit, though. Probably my fault for not understanding the OBS settings...


    (*) "but with every intention of giving it back"

    PerlMonks XP is useless? Not anymore: XPD - Do more with your PerlMonks XP
Color die and warn messages
No replies — Read more | Post response
by Anonymous Monk
on Feb 15, 2024 at 18:50
    This scratched an itch for me, no guarantees.
    use warnings::colored; warn "warning"; # yellow system "non-existant-command"; # red say "test"; # none eval { die "caught" }; # none say $@; # none die "died"; # red
    And the implementation:
Fun with terminal color
2 direct replies — Read more / Contribute
by cavac
on Feb 09, 2024 at 18:10

    Yesterday i thought i might fix some bugs in my imagecat routine (printing images to the terminal). But somehow i, uhm, improved things a little too far. If you're now thinking "Oh no, he did it again!", i'm afraid, yes, i did.

    This time, i made a 3 minute animated demo with sound, running at (some) FPS in a Linux terminal.

    You can find a download of the current version as a tarball or you could clone my public mercurial repo if you prefer that. (Edit: Link to mercurial has changed/updated)

    The demo needs quite a few CPAN modules, i'm afraid, including the somewhat hard-to-install SDL bindings. (For SDL, you might have to change a checksum in the Alien::SDL Build.PL and ignore some test errors in SDL as well.) I'm using SDL for the sound output.

    Also, your Terminal needs to support full RGB colors /see the Imagecat - show color images in a terminal thread) and have a size of at least 270x60 in size (characters, not pixels).

    If you want to avoid the hassle of getting it to work on your system, you can watch the YouTube video instead.

    PerlMonks XP is useless? Not anymore: XPD - Do more with your PerlMonks XP
Streaming download from S3 compatible storage
1 direct reply — Read more / Contribute
by kikuchiyo
on Feb 08, 2024 at 10:51

    $subject came up at $work.

    The following minimal example appears to work:

    #!/usr/bin/perl use Modern::Perl '2021'; use Net::Amazon::S3::Client; # edit these my $aws_access_key_id = '...'; my $aws_secret_access_key = '...'; my $host = '...'; my $bucket_name = '...'; my $secure = 1; my $client = Net::Amazon::S3::Client->new ( host => $host, aws_access_key_id => $aws_access_key_id, aws_secret_access_key => $aws_secret_access_key, secure => $secure, retry => 1, ); my $bucket = $client->bucket( name => $bucket_name ); my $object = $bucket->object( key => $ARGV[0] ); $object->get_callback(sub { my $s = length($_[0]); print STDERR "Got chunk size $s\n"; # do something with $_[0] });

    This is applicable if you want to serve a file from an S3 compatible storage via an async backend.

    The get_callback method is not documented (it is mentioned in passing only in Net::Amazon::S3::Client::Object::Range), but in the end it works.

Shotgun.pl - Shoots Holes in Files
1 direct reply — Read more / Contribute
by BlueSquare23
on Jan 29, 2024 at 18:27

    Shotgun.pl

    Shoots Holes in Files

      Cyber weapon! For home defense purposes only!

    Have you ever had a file you just wanted to blast with a shotgun? Now you can!


    Can play audio files via aplay or mpv (tested on Ubuntu). Or use -quiet to run with no sound effects.

    #!/usr/bin/env perl # This script shoots holes in files. # Written by John R., Nov. 2023 use strict; use warnings; use lib "$ENV{HOME}/perl5/lib/perl5"; use Getopt::Long; use File::Slurp qw(read_file); use File::JSON::Slurper qw(read_json write_json); use File::Touch 0.12; use IPC::Cmd 'run_forked'; use String::ShellQuote; use Data::Dumper; use JSON; # Change as needed. Tested and should work with `aplay` and `mpv` on U +buntu. my $AUDIO_PLAYER = '/usr/bin/mpv'; sub usage { my $err_msg = shift; print "$err_msg\n" if $err_msg; print <<~EOF; Name: shotgun.pl - Shoots holes in files Usage: shotgun.pl [options] Options (required, at least one): -target File you want to shoot holes in -reload Reload magazine file -check Check the mag Options (optional): -help Print this help menu -type [double|pump] Shotgun type -load [bird|buck|slug] Type of ammunition -shots [int] Number of shots to fire or load -quiet Mute sound effect -debug Debug mode, takes no action -verbose Verbose mode, more verbose output Defaults: -type double -load bird EOF exit; } my $MAG_FILE = 'mag.txt'; my %O = ( debug => 0, verbose => 1, type => 'double', load => 'bird', ); GetOptions(\%O, 'help', 'debug', 'verbose!', 'target=s', 'type=s', 'load=s', 'shots=i', 'reload', 'check', 'quiet', ) or usage(); $O{verbose}++ if $O{debug}; $O{verbose} = 0 if $O{quiet}; # Sanity checks. usage() if $O{help}; usage("Missing required argument!") unless $O{target} or $O{reload} or + $O{check}; usage("Invalid shotgun type!") unless($O{type} =~ /double|pump/); usage("Invalid ammo type!") unless($O{load} =~ /bird|buck|slug/); unless (-e $MAG_FILE) { print "Mag file not found, reloading...\n" if $O{verbose}; touch($MAG_FILE); reload(); } reload() if $O{reload}; check() if $O{check}; exit unless $O{target}; die("Target file does not exits!") unless -e $O{target}; die("Target file must be plain text!") unless -f $O{target}; die("Target file must be under 1 GB!") if -s $O{target} > (1024 * 1024 +); my $MAG = read_json($MAG_FILE) or die("Problem reading mag file!"); unless ($MAG->{$O{type}}) { print "Mag for $O{type} not loaded, you'll need to reload!\n"; exit; } if ($MAG->{$O{type}}->{num_rounds} == 0) { print "Mag empty, you'll need to reload!\n"; exit; } while ($MAG->{$O{type}}->{num_rounds} > 0) { shoot(); $MAG->{$O{type}}->{num_rounds}--; write_json($MAG_FILE, $MAG); } exit; ## Subroutines sub reload { my $num_rounds = 2; $num_rounds = 5 if $O{type} eq 'pump'; $num_rounds = $O{shots} if $O{shots} and $O{shots} < $num_rounds; my %load = ( 'num_rounds' => $num_rounds, 'load' => $O{load}, ); my %full_load = ($O{type} => \%load); write_json($MAG_FILE, \%full_load); unless ($O{quiet}) { for (my $i = 0; $i < $num_rounds; $i++) { print "Loading shot $i\n" if $O{verbose}; run_forked(join(" ", $AUDIO_PLAYER, "$O{type}_reload.wav") +); } } print "Shotgun reloaded!\n"; check() if $O{verbose}; } sub shoot { return if $O{debug}; my @lines = read_file($O{target}); # We're only going to work in this space. my $height = @lines; my $width = 80; my $v_buffer = int rand($height); my $h_buffer = int rand($width); my $v_spread = 7; my $h_spread = 13; my $r = int rand(3); for (my $v=0; $v < $v_spread; $v++) { my $v_offset = $v_buffer + $v; last if $v_offset >= $height; my @line = split '', $lines[$v_offset]; for (my $h=0; $h < $h_spread; $h++) { my $h_offset = $h_buffer + $h; # Belt and suspenders. last if $h_offset >= @line; last if $line[$h_offset] eq "\n"; if ($O{load} eq 'buck') { my %pattern0 = (0=>[6,7], 1=>[1,2,6,7], 2=>[1,2,11,12], 3=>[6,7,11,12], 4=>[1,2,6,7], 5=>[1,2,9,10], 6=>[9,10]); my %pattern1 = (0=>[1,2,9,10], 1=>[1,2,9,10], 2=>[5,6], 3=>[1,5,6,10,11], 4=>[1,2,10,11], 5=>[6,7], 6=>[6,7]); my %pattern2 = (0=>[5,6,7], 1=>[1,2,6,10,11], 2=>[1,2,10,11], 3=>[5,6,7], 4=>[1,2,6], 5=>[1,2,10], 6=>[9,10]); my %buck = (0 => \%pattern0, 1 => \%pattern1, 2 => \%pattern2); $line[$h_offset] = " " if grep {$_ == $h} @{$buck{$r}- +>{$v}}; } elsif ($O{load} eq 'slug') { my %pattern0 = (0=>[5,6,7], 1=>[5,6]); my %pattern1 = (0=>[5,6], 1=>[5,6,7]); my %pattern2 = (0=>[5,6], 1=>[4,5,6]); my %slug = (0 => \%pattern0, 1 => \%pattern1, 2 => \%pattern2); $line[$h_offset] = " " if grep {$_ == $h} @{$slug{$r}- +>{$v}}; } else { my %pattern0 = (0=>[6], 1=>[3,9], 2=>[6], 3=>[3], 4=>[1,6,10], 5=>[4], 6=>[0,7]); my %pattern1 = (0=>[6], 1=>[3,9], 2=>[6,11], 3=>[3,7,9], 4=>[6,10], 5=>[4,9], 6=>[7,11]); my %pattern2 = (0=>[6,9], 1=>[2,4,7], 2=>[5,9], 3=>[1,7], 4=>[6], 5=>[3,6,9], 6=>[5]); my %bird = (0 => \%pattern0, 1 => \%pattern1, 2 => \%pattern2); $line[$h_offset] = " " if grep {$_ == $h} @{$bird{$r}- +>{$v}}; } $lines[$v_offset] = join('', @line); } } open my $fh, '>', $O{target} or die("Unable to open target file!") +; foreach (@lines) { print $fh $_; } close $fh; # For dramatic effect. print "POW!\n" and return if $O{quiet}; run_forked(join(" ", $AUDIO_PLAYER, "$O{type}.wav")); print "POW!\n"; run_forked(join(" ", $AUDIO_PLAYER, "shot.wav")); } sub check { $MAG = read_json($MAG_FILE) or die("Problem reading mag file!"); print JSON->new->ascii->pretty->encode($MAG); }
    Source on My Github Video of Script in Action
Acceleration ETA algorithm
3 direct replies — Read more / Contribute
by phizel
on Jan 27, 2024 at 12:49
    Was looking for a decent algorithm for determining the ETA of a long-running process, but everything on CPAN uses simplistic and inaccurate algorithms. Found this great article Benchmarking I/O ETA algorithms and converted the Acceleration algorithm to perl. And yes, it would be better to extract the state components into an object.
    use Time::HiRes qw(time); sub eta { my ($cur, $total, $time) = @_; return unless $cur and $time; state ($last_progress, $last_time, $last_v, $last_eta); state (@v, @eta, $window_size, $window_idx); state $init = do { ($last_progress, $last_time, $last_v, $last_eta) = (0, 0, 0, - +1); ($window_size, $window_idx) = (10, 0); }; state $sub_v_weight = sub { 1 + $_[0] }; state $sub_eta_weight = sub { $_[0] ? 2 * $_[1] : 1 }; state $sub_weighted_avg = sub { my ($sub_weight, $avg, $total_weight, $w) = (shift, 0, 0, 0); for my $i (0 .. $#_) { # first version messed up the index. my $j = ($i + @_ - $window_idx - 1) % @_; $w = $sub_weight->($j, $w); $avg += $w * $_[$i]; $total_weight += $w; } return $avg / $total_weight; }; my $v = ($cur - $last_progress) / (($time - $last_time) || 1); $v[$window_idx] = $v; $v = $sub_weighted_avg->($sub_v_weight, @v); if ($v and $last_v) { my ($min_v, $max_v) = $v < $last_v ? ($v, $last_v) : ($last_v, + $v); $v = $last_v + ($v - $last_v) * $min_v / $max_v; } my $a = ($v - $last_v) / ($last_time ? ($time - $last_time) : 1); my $r = $total - $cur; my $eta = $last_eta; if ($a and 0 < (my $d = ($v * $v + 2 * $a * $r))) { $eta = (sqrt($d) - $v) / $a; } elsif ($v) { $eta = $r / $v } $eta[$window_idx] = $eta; $eta = $sub_weighted_avg->($sub_eta_weight, @eta); ($last_progress, $last_time, $last_v, $last_eta, $window_idx) = ($cur, $time, $v, $eta, ($window_idx + 1) % $window_size); return $eta > 0 ? $eta : 0; }
Munging file name, to be safe- & usable enough on Unix-like OSen & FAT32 file system
2 direct replies — Read more / Contribute
by parv
on Nov 25, 2023 at 23:12

    A program written in a hurry some time ago to munge file paths generally for file systems for Unix(-like) OSen & specifically for FAT32.

    Learned the hard way that NTFS would allow file names to be written to FAT32 even if some characters are outside of FAT32 specification. Problematic characters seemed to be en- & em-dash, fancy quotes, pipe, Unicode "?", & possibly few others (web pages saved with title as the file name). Mounting FAT32 file system on FreeBSD with specific codepage(s), or "nowin95" or "shortnames" mount options did not help (mount_msdosfs(8)). Munging it was then🤷🏽‍♂️

    # quick-sanename.pl use strict; use warnings; use feature qw[ state ]; use File::Copy qw[ move ]; @ARGV or die qq[Give a file name to sanitize.\n]; my $dry_run = 0; my $noisy = 1; my $lowercase = 1 ; my $for_windows = 1; my $clean_past_255 = 1; # General cleansing of base names. my %cleansed = run_cleanser( \&Cleanser::cleanse, @ARGV ); if ( $for_windows ) { if ( ! %cleansed ) { %cleansed = run_cleanser( \&Cleanser::cleanse_for_windows, @ARGV + ); } else { # Work on the changes of general cleansing. while( my ( $old, $once ) = each %cleansed ) { my $again = Cleanser::cleanse_for_windows( $once ) or next; $cleansed{ $old } = $again; } # Take care of those which were skipped during general cleansing +. my @todo = grep { ! exists $cleansed{ $_ } } @ARGV; my %win_cleansed = run_cleanser( \&Cleanser::cleanse_for_windows +, @todo ); %cleansed = ( %cleansed, %win_cleansed ); } } %cleansed or die qq[No new file names were generated.\n]; # Move file. for my $old ( sort keys %cleansed ) { my $new = $cleansed{ $old }; if ( $noisy || $dry_run ) { printf qq['%s' -> '%s'\n] , $old, $new; } $dry_run and next; if ( -e $new ) { warn qq[Skipped rename of "$old", "$new" already exists.\n]; next; } if ( ! move( $old, $new ) ) { warn qq[Could not move "$old" to "$new": $!\n]; } } exit; sub run_cleanser { my ( $clean_type, @old_path ) = @_; @old_path or return (); my %out; for my $one ( @old_path ) { my $new = $clean_type->( $one ) or next; $out{ $one } = $new; } return %out; } BEGIN { package Cleanser; use File::Basename qw[ fileparse ]; use File::Spec::Functions qw[ canonpath catfile ]; sub path_if_diff { my ( $old, $dir, $cleaned_base ) = @_; $lowercase and $cleaned_base = lc $cleaned_base; my $new = canonpath( catfile( $dir, $cleaned_base ) ); return $old ne $new ? $new : undef; } # Returns a cleaned path if possible; else C<undef>. # # Substitues various characters with "_" as minimally as possible. sub cleanse { my ( $old_path ) = @_; # Yes, I do mean to keep any word & digit in any writing script +(language). #state $alnum = 'a-zA-Z0-9'; state $alnum = '\w\d'; # quotemeta() does not escape "(" which causes warning that it w +ould be # deprecated in 5.30. state $left_brace = '\\{'; state $covered = q/}()[]/; state $meta = $left_brace . quotemeta( qq/${covered}@/ ); state $punc = q/-=,._/; my $no_keep = qq/[^${punc}${alnum}${meta}]+/; $no_keep = qr/$no_keep/u; state $punc_or = join( q/|/, $left_brace, map { split '', quotemeta $_ } ( $covered, +$punc ) ); state $many_seq = qr/[${punc}]{2,}/; state $pre_seq = qr/[${punc}]+_/; state $post_seq = qr/_[${punc}]+/; my ( $base, $dir ) = fileparse( $old_path ); for ( $base ) { s/$no_keep/_/g; # Collapse same. s/($punc_or)\1/$1/g; # Collapse any sequence. s/$pre_seq/_/g; s/$post_seq/_/g; s/$many_seq/_/g; } return path_if_diff( $old_path, $dir, $base ); } # Returns a cleaned path if possible; else C<undef>. # # It tries to keep a file path be a smaller set of characters for f +iles on # Microsoft Windows. # # Nothing is replaced, only a warning is issued for file names that + match ... # # CON, PRN, AUX, NUL, COM0, COM1, COM2, COM3, COM4, COM5, COM6, + COM7, # COM8, COM9, LPT0, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, L +PT8, and # LPT9 # # See https://learn.microsoft.com/en-us/windows/win32/fileio/naming +-a-file that lists # ... # Use any character in the current code page for a name, includin +g Unicode # characters and characters in the extended character set (128–25 +5), except # for the following: # # The following reserved characters: # < (less than) # > (greater than) # : (colon) # " (double quote) # / (forward slash) # \ (backslash) # | (vertical bar or pipe) # ? (question mark) # * (asterisk) # # Integer value zero, sometimes referred to as the ASCII NUL # character. # # Characters whose integer representations are in the range +from 1 # through 31, except for alternate data streams where these +characters # are allowed # ... # Do not use the following reserved names for the name of a file: # # CON, PRN, AUX, NUL, COM0, COM1, COM2, COM3, COM4, COM5, COM6, + COM7, # COM8, COM9, LPT0, LPT1, LPT2, LPT3, LPT4, LPT5, LPT6, LPT7, L +PT8, and # LPT9. Also avoid these names followed immediately by an exten +sion; for # example, NUL.txt and NUL.tar.gz are both equivalent to NUL. F +or more # information, see Namespaces. # # Do not end a file or directory name with a space or a period. A +lthough # the underlying file system may support such names, the Windows +shell and # user interface does not. However, it is acceptable to specify +a period # as the first character of a name. For example, ".temp". # ... # sub cleanse_for_windows { my ( $old_path ) = @_; state $bad_char = q[<>:"|?*] . '\\' . join( q[], map { chr } 0..31 ) ; my %sub_replace = ( qr/[^\x00-\xff]+/ => q[^], q/(?:[.]|[ ]+)$/ => q[_], qq/[$bad_char]/ => q[-], ); my ( $base, $dir ) = fileparse( $old_path ); $base = prefix_windows_reserved( $base ); for ( $base ) { for my $found ( keys %sub_replace ) { my $repl = $sub_replace{ $found }; s{$found}{$repl}g; } } return path_if_diff( $old_path, $dir, $base ); } # Returns the base name prefixed with "_" if it matches a reserved +word. sub prefix_windows_reserved { my ( $base ) = @_; # Prefix with "_". state $prefix = q[_]; state $reserved = join( q[|], qw[ CON PRN AUX NUL COM0 COM1 COM2 COM3 COM4 COM5 COM6 COM7 +COM8 COM9 LPT0 LPT1 LPT2 LPT3 LPT4 LPT5 LPT6 LPT7 +LPT8 LPT9 ] ); state $regex = qq/^( (?: $reserved )(?:[.].+)? )\$/; $base =~ s{$regex}{$prefix$1}xi; return $base; } }
uparse - Parse Unicode strings
6 direct replies — Read more / Contribute
by kcott
on Nov 18, 2023 at 03:53

    Improvement: See "Re: Decoding @ARGV [Was: uparse - Parse Unicode strings]" for an improved version of the code; mostly thanks to ++jo37 and the subthread starting with "Re: uparse - Parse Unicode strings" and continued in "Decoding @ARGV [Was: uparse - Parse Unicode strings]".

    In the last month or so, we've had a number of threads where emoji were discussed. Some notable examples: "Larger profile pic than 80KB?"; "Perl Secret Operator Emojis"; and "Emojis for Perl Monk names".

    Many emoji have embedded characters which are difficult, or impossible, to see; for example, zero-width joiners, variation selectors, skin tone modifiers. In some cases, glyphs are so similar that it's difficult to tell them apart; e.g. 🧑 & 👨.

    I wrote uparse to split emoji, strings containing emoji, and in fact any strings with Unicode characters, into their component characters.

    #!/usr/bin/env perl BEGIN { if ($] < 5.007003) { warn "$0 requires Perl v5.7.3 or later.\n"; exit; } unless (@ARGV) { warn "Usage: $0 string [string ...]\n"; exit; } } use 5.007003; use strict; use warnings; use open IO => qw{:encoding(UTF-8) :std}; use constant { SEP1 => '=' x 60 . "\n", SEP2 => '-' x 60 . "\n", FMT => "%s\tU+%-6X %s\n", NO_PRINT => "\N{REPLACEMENT CHARACTER}", }; use Encode 'decode'; use Unicode::UCD 'charinfo'; for my $raw_str (@ARGV) { my $str = decode('UTF-8', $raw_str); print "\n", SEP1; print "String: '$str'\n"; print SEP1; for my $char (split //, $str) { my $code_point = ord $char; my $char_info = charinfo($code_point); if (! defined $char_info) { $char_info->{name} = "<unknown> Perl $^V supports Unicode +" . Unicode::UCD::UnicodeVersion(); } printf FMT, ($char =~ /^\p{Print}$/ ? $char : NO_PRINT), $code_point, $char_info->{name}; } print SEP2; }

    Here's a number of example runs. All use <pre> blocks; a very few didn't need this but I chose to go with consistency.

    Works with ASCII (aka Unicode: C0 Controls and Basic Latin)

    $ uparse X XY "X        Z"
    
    ============================================================
    String: 'X'
    ============================================================
    X       U+58     LATIN CAPITAL LETTER X
    ------------------------------------------------------------
    
    ============================================================
    String: 'XY'
    ============================================================
    X       U+58     LATIN CAPITAL LETTER X
    Y       U+59     LATIN CAPITAL LETTER Y
    ------------------------------------------------------------
    
    ============================================================
    String: 'X      Z'
    ============================================================
    X       U+58     LATIN CAPITAL LETTER X
    �       U+9      <control>
    Z       U+5A     LATIN CAPITAL LETTER Z
    ------------------------------------------------------------
    

    The two similar emoji heads (mentioned above)

    $ uparse 🧑 👨
    
    ============================================================
    String: '🧑'
    ============================================================
    🧑      U+1F9D1  ADULT
    ------------------------------------------------------------
    
    ============================================================
    String: '👨'
    ============================================================
    👨      U+1F468  MAN
    ------------------------------------------------------------
    

    A complex ZWJ sequence

    $ uparse 👨🏽‍✈️
    
    ============================================================
    String: '👨🏽‍✈️'
    ============================================================
    👨      U+1F468  MAN
    🏽      U+1F3FD  EMOJI MODIFIER FITZPATRICK TYPE-4
            U+200D   ZERO WIDTH JOINER
    ✈       U+2708   AIRPLANE
            U+FE0F   VARIATION SELECTOR-16
    ------------------------------------------------------------
    

    Maps

    $ uparse 🇨🇭
    
    ============================================================
    String: '🇨🇭'
    ============================================================
    🇨       U+1F1E8  REGIONAL INDICATOR SYMBOL LETTER C
    🇭       U+1F1ED  REGIONAL INDICATOR SYMBOL LETTER H
    ------------------------------------------------------------
    

    Handles codepoints not yet assigned; or not supported with certain Perl versions

    $ uparse `perl -C -e 'print "X\x{1fa7c}X"'`
    
    ============================================================
    String: 'X🩼X'
    ============================================================
    X       U+58     LATIN CAPITAL LETTER X
    🩼      U+1FA7C  CRUTCH
    X       U+58     LATIN CAPITAL LETTER X
    ------------------------------------------------------------
    
    $ uparse `perl -C -e 'print "X\x{1fa7c}X"'`
    
    ============================================================
    String: 'X🩼X'
    ============================================================
    X       U+58     LATIN CAPITAL LETTER X
    �       U+1FA7C  <unknown> Perl v5.30.0 supports Unicode 12.1.0
    X       U+58     LATIN CAPITAL LETTER X
    ------------------------------------------------------------
    
    $ uparse `perl -C -e 'print "X\x{1fa7d}X"'`
    
    ============================================================
    String: 'X🩽X'
    ============================================================
    X       U+58     LATIN CAPITAL LETTER X
    �       U+1FA7D  <unknown> Perl v5.39.3 supports Unicode 15.0.0
    X       U+58     LATIN CAPITAL LETTER X
    ------------------------------------------------------------
    

    Enjoy!

    — Ken

Introducing the C Perl-Powered Pre-Processor
1 direct reply — Read more / Contribute
by NERDVANA
on Nov 09, 2023 at 02:03

    For those developers who do both C and Perl, and frequently run into the upper limit of the C preprocessor, I have a treat for you!

    CodeGen::Cpppp

    It's still a little rough around the edges, and could use lots more features, but I think it's reached a point of usability where it's worth sharing.

Automate Outlook via Win32::OLE to extract PDFs from mails
1 direct reply — Read more / Contribute
by Corion
on Nov 02, 2023 at 04:46

    This is my somewhat generic framework to process mails in specific folders in Outlook. The concrete use case here is to find and save PDFs that haven't been processed yet.

    The script could also move mails or even reply to them, but the intention is to co-exist with human users of this shared mailbox, so the script scans several mail folders for files with an unknown name.

    For more information on the object model (and especially the MailItem and Folder class), see the MS Outlook object model.

    #!perl use 5.020; use feature 'signatures'; no warnings 'experimental::signatures'; use Getopt::Long; use utf8; use File::Basename 'dirname'; use File::Spec; use Win32::OLE 'in'; use Win32::OLE::Const 'Microsoft Outlook'; use Win32::OLE::Variant; use Scalar::Util 'blessed'; use Encode 'encode', 'decode'; use POSIX 'strftime'; #use PDFContents::Cache; # We output UTF-8 system('chcp 65001 >NUL:'); binmode STDOUT, ':encoding(UTF-8)'; local $| = 1; GetOptions( 'quick' => \my $quick_run, 'target-directory|t=s' => \my $target_dir, ); $target_dir ||= dirname($0) . "/INPUT"; my $target_dir = File::Spec->rel2abs( $target_dir ); my $outlook = Win32::OLE->GetActiveObject('Outlook.Application') || Win32::OLE->new('Outlook.Application', 'Quit'); my $namespace = $outlook->GetNamespace("MAPI"); #my $Folder = $namespace->GetDefaultFolder(olFolderInbox); # Output some folder names for debugging #for my $f (in($namespace->Folders->{"#MAGIC-MAILBOX"}->Folders->{Post +eingang}->Folders)) { # #say "Posteingang!" . $f->{Name}; #} sub progress( $info ) { state $last_progress; print join "", " " x length($last_progress), "\r", $info, "\r"; $last_progress = $info; } sub find_folder($path) { my $folder = $namespace->Folders->{"#MAGIC-MAILBOX"}; for my $el (split /!/, $path) { $el = encode('Latin-1', $el); my $next_folder = $folder->Folders->{$el}; if( ! $next_folder ) { warn "No folder found for '$el' in '$path'"; for( in($folder->Folders) ) { say "<$_->{Name}>"; }; }; $folder = $next_folder; }; return $folder; } # Read all PDFs we already rejected opendir my $dh, "$target_dir/rejected"; our @blacklist = readdir $dh; closedir $dh; # iterate over folders sub for_all_mails( $folder, $callback ) { if( ! $folder->Items ) { return 0; }; my $count; my $list = $folder->Items; my $msg = $list->GetFirst; while( $msg ) { $count += $callback->($msg); $msg = $list->GetNext; } return $count; } sub save_mail_attachments( $msg, $target_directory=$target_dir ) { foreach my $atch (reverse in($msg->{Attachments})) { my $name = $atch->{FileName}; if($name =~ m/.pdf$/i){ #say "Überspringe $_" if grep { $_ eq $name } @bla +cklist; next if grep { $_ eq $name } @blacklist; my $target = $target_dir . "/" . $name; if( ! -f $target or ! -s $target) { #$new++; $atch->SaveAsFile($target); } else { #say "Already exists ".$atch->{FileName}; } } } } sub save_attachments( $folder ) { progress($folder->Name); for_all_mails( $folder, \&save_mail_attachments ); } sub in_all_subfolders( $folder, $callback, $visual=$folder->Name ) { $callback->($folder); #for my $subfolder (grep { defined } $folder->Folders) { my $folders = $folder->Folders; my $subfolder = $folders->GetLast; while( $subfolder ) { in_all_subfolders( $subfolder, $callback, $visual . ' > ' . $s +ubfolder->Name ); $subfolder = $folders->GetPrevious; }; } my $count = 0; my $Folder = find_folder("Posteingang!incoming stuff"); #for my $f (in ($Folder->Folders)) { # say join "/", $Folder->{Name}, $f->{Name}; #}; # Find a folder named "from Somebody", but as a substring, since it mi +ght contain Umlauts or whatever for my $f (in ($Folder->Folders)) { #say join "/", $Folder->{Name}, $f->{Name}; if( $f->Name =~ m!from Somebody$! ) { $Folder = $f; last; }; }; $count += save_attachments( $Folder ); if( $quick_run ) { # nothing to do } else { in_all_subfolders( $Folder, sub( $this_folder ) { $count += save_attachments($this_folder); }); $count += save_attachments( find_folder("Posteingang")); $count += save_attachments( find_folder("Posteingang!to-sort")); $count += save_attachments( find_folder("Posteingang!to-sort-later +")); for my $folder (in(find_folder('Posteingang!in-progress')->Folders +)) { progress( $folder->Name ); $count += save_attachments( $folder ); } for my $folder (reverse in(find_folder('Posteingang!by-ticket-numb +er')->Folders)) { in_all_subfolders( $folder, sub( $this_folder ) { $count += save_attachments($this_folder); }); } } my $ts = strftime '%Y-%m-%dT%H:%M:%S', localtime; in_all_subfolders( find_folder("Posteingang!some!deep!subfolder"), sub +($folder) { my $foldername = $folder->{Name}; #progress($foldername); my $count; for_all_mails( $folder, sub( $msg ) { progress( "$foldername - $count" ); $count++; for my $att (reverse in($msg->{Attachments})) { my $id = $msg->{EntryId}; my $fn = $att->{FileName}; return unless $fn =~ /\.pdf\z/i; # process the PDF contents # PDFContents::Cache::add_mailinfo($foldername, $fn, $id, +$ts); } 1 }); }); progress(""); say "$count new PDFs found";
Fixing bad CSS in EPUB files
1 direct reply — Read more / Contribute
by jimhenry
on Sep 05, 2023 at 21:02

    Many epubs come with unprofessional CSS that will not display correctly on some ebook readers. For instance, the font size may be illegibly small on a mobile device, or the user may have dark mode turned on, but the CSS specifies element foreground colors according to an assumed (but not specified) white background, so there is little or no contrast with the actual black background. I recently wrote a script to detect epubs with those problems, then one to detect and fix them.

    My first attempt at this used EPUB::Parser, but I soon found that it didn't (as far as I could tell) have the functionality I needed to get at the internal CSS files and edit them. So I fell back on Archive::Zip (which EPUB::Parser uses) -- an epub is a zip file containing css, html, and xml files (and sometimes jpg's, etc.).

    The full code and assocated files
    The documentation

    Here, I present two of the trickier functions; inverse_color() is passed a CSS color value of some kind (which can be a wide array of formats), calculates a complementary color, and returns it. It makes use of functions from Graphics::ColorUtils to map CSS color names to rgb values. It is called by fix_css_colors() when it finds a CSS block containing a color: attribute but no background-color: attribute.

    sub inverse_color { my $color = shift; die "Missing argument to inverse_color()" unless $color; state $color_names; if ( not $color_names ) { #set_default_namespace("www"); $color_names = available_names(); } $color =~ s/^\s+//; $color =~ s/\s+$//; if ( $color =~ /^#[[:xdigit:]]{3}$/ ) { $color =~ s/#//; my $n = hex $color; my $i = 0xFFF - $n; my $inverse = sprintf "#%03x", $i; return $inverse; } elsif ( $color =~ /^#[[:xdigit:]]{6}$/ ) { $color =~ s/#//; my $n = hex $color; my $i = 0xFFFFFF - $n; my $inverse = sprintf "#%06x", $i; return $inverse; } elsif ( $color =~ /rgb \s* \( \s* ([0-9]+) \s* , \s* ([0-9]+) , +\s* ([0-9]+) \s* \) /x ) { my ($r, $g, $b) = ($1, $2, $3); my $n = $r * 65536 + $g * 256 + $b; printf "converted %s to %06x\n", $color, $n if $verbose; my $i = 0xFFFFFF - $n; my $inverse = sprintf "#%06x", $i; return $inverse; } elsif ( $color =~ /rgba \s* \( \s* ([0-9]+) \s* , \s* ([0-9]+) , + \s* ([0-9]+) \s* , \s* ([0-9.]+) \s* \) /x ) { my ($r, $g, $b, $alpha) = ($1, $2, $3, $4); my $inverse = sprintf "rgba( %d, %d, %d, %0.2f )", 255 - $r, 255 - + $g, 255 - $b, 1 - $alpha; return $inverse; } elsif ( $color =~ /hsl \s* \( \s* ([0-9]+) \s* , \s* ([0-9]+)% +, \s* ([0-9]+)% \s* \) /x ) { my ( $hue, $saturation, $lightness ) = ($1, $2, $3); my $hue2 = ($hue + 180) % 360; my $sat2 = 100 - $saturation; my $light2 = 100 - $lightness; my $inverse = sprintf "hsl( %d, %d%%, %d%% )", $hue2, $sat2, $ligh +t2; return $inverse; } elsif ( $color =~ /hsla \s* \( \s* ([0-9]+) \s* , \s* ([0-9]+)% + , \s* ([0-9]+)% \s* , \s* ([0-9.]+) \s* \) /x ) { my ( $hue, $saturation, $lightness, $alpha ) = ($1, $2, $3, $4); my $hue2 = ($hue + 180) % 360; my $sat2 = 100 - $saturation; my $light2 = 100 - $lightness; my $alpha2 = 1 - $alpha; my $inverse = sprintf "hsl( %d, %d%%, %d%%, %0.2f )", $hue2, $sat2 +, $light2, $alpha2; return $inverse; } elsif ( $color =~ /currentcolor/i ) { warn "Should have removed currentcolor in fix_css_colors()"; } elsif ( $color =~ /inherit/i ) { return "inherit"; } elsif ( $color_names->{ "www:". $color} or $color_names->{ $colo +r} ) { my $hexcolor = name2rgb( $color ); if ( not $hexcolor ) { $hexcolor = name2rgb( "www:" . $color ); if ( not $hexcolor ) { die "Can't resolve color name $color"; } } $hexcolor =~ s/#//; my $i = 0xFFFFFF - hex($hexcolor); my $inverse = sprintf "#%06x", $i; return $inverse; } else { die "Color format not implemented: $color"; } } sub fix_css_colors { my ($csstext, $css_fn, $epub_fn) = @_; return if not $csstext; my $errors = 0; my $corrections = 0; my $printed_filename = 0; say "Checking $epub_fn:$css_fn for bad colors\n" if $verbose; # this might be a good use of negative lookbehind? my @css_blocks = split /(})/, $csstext; for my $block ( @css_blocks ) { if ( $block =~ m/color: \s* ( [^;]+ ) \s* (?:;|$) /x ) { my $fgcolor = $1; print "found color: $fgcolor\n" if $verbose; if ( $fgcolor =~ m/currentcolor/i ) { $block =~ s/(color: \s* currentcolor \s* ;? \s* ) \n* //xi; print "Stripping out $1 as it is a pleonasm\n" if $verbose; $corrections++; next; } if ( $block !~ m/background-color:/ ) { my $bgcolor = inverse_color( $fgcolor ); $block =~ s/(color: \s* [^;}]+ \s* (?:;|$) )/background-color: + $bgcolor;\n$1/x; print "corrected block:\n$block\n}\n" if $verbose; $corrections++; } } } if ( $corrections ) { my $new_css_text = join "", @css_blocks; return $new_css_text; } else { return undef; } }
A podcatcher in Perl
5 direct replies — Read more / Contribute
by jimhenry
on Sep 05, 2023 at 20:39

    A while ago I wrote a podcatcher in Perl. In the last few days I've finally gotten around to cleaning it up a bit, finishing the documentation, and getting it out where people can use it (on my website for now -- maybe I'll try to submit it to CPAN at some point).

    The full code (and associated files) can be found at http://jimhenry.conlang.org/software/podcatcher.zip and the documentation (including per-function summaries) at http://jimhenry.conlang.org/software/podcatcher.html

    Here, I'll just briefly discuss one of the functions that gave me some trouble, given the variety of podcast RSS feeds out there and how weirdly (sometimes invalidly) formatted some of them are.

    This function is passed an RSS feed as a single string and attempts to extract the podcast episode URLs from it. First it tries to parse the RSS using XML::RSS::LibXML. Then, if that worked, it tries to find episodes in <enclosure> tags, then if that fails, it tries looking in <media:content> tags. If it failed to parse the RSS file, or if it parsed and failed to find any podcasts in the appropriate tags, it does a brute force regular expression match on the whole RSS file to find anything that starts with http and ends with one of the file extensions we're looking for (which is configurable).

    sub get_mp3_links_from_string { my $pagecontent = shift; my @episodes; my $parser = XML::RSS::LibXML->new; # for some bizarre reason, putting curly brackets around this eval + generates # syntax errors. use q// instead. eval q/ $parser->parse($pagecontent) /; if ( $@ ) { writelog "Could not parse page as XML/RSS: $@\n"; $parser = undef; } if ( $parser ) { foreach my $item (@{ $parser->{items} }) { my $ep; if ( defined $item->{enclosure} ) { if ( $ep = $item->{enclosure}{url} and $ep =~ m!$extension_reg +ex$! ) { push @episodes, { url => $ep }; } elsif ( $ep = $item->{media}{content}{url} and $ep =~ m!$ext +ension_regex$! ) { push @episodes, { url => $ep }; } next if not $ep; } else { next; } if ( $config{description} ) { $episodes[ $#episodes ]->{title} = $item->{title}; $episodes[ $#episodes ]->{description} = $item->{description}; } } # end for each <item> } # end if we have a valid parse unless ( @episodes ) { writelog "Found no $config{extensions} files by parsing XML, check +ing via regex for any $config{extensions} links in any context\n"; my @mp3s = uniq ( $pagecontent =~ m/(http[^\s>]+$extension_re +gex)/gi ); return undef unless ( @mp3s ); foreach ( @mp3s ) { push @episodes, { url => $_ }; } } return \@episodes; # @mp3s; }
MCE Sandbox 2023-08
2 direct replies — Read more / Contribute
by marioroy
on Aug 28, 2023 at 02:03

    The MCE Sandbox repository is where I try writing fast code using Perl MCE + Inline::C, Math::Prime::Util, and the C/C++ libprimesieve library. The demos and examples folders are new for the 2023 update. I learned Codon, a Python-like language that compiles to native code.

    .Inline/ Where Inline::C is configured to cache C object file +s. bin/ algorithm3.pl Practical sieve based on Algorithm3 from Xuedong Luo + [1]. primesieve.pl Calls the primesieve.org C API for generating primes +. primeutil.pl Utilizes the Math::Prime::Util module for primes. demos/ primes1.c Algorithm3 in C with OpenMP directives. primes2.codon Algorithm3 in Codon, a Python-like language. primes3.c Using libprimesieve C API in C primes4.codon Using libprimesieve C API in Codon examples/ Progressive demonstrations. practicalsieve.c single big loop segmentsieve.c segmented variant, faster rangesieve.c process range; start stop prangesieve.c parallel rangesieve in C cpusieve.codon parallel rangesieve in Codon (CPU) gpusieve.codon parallel rangesieve in Codon (GPU) pgpusieve.codon using Codon @par(gpu=True) syntax cudasieve.cu using NVIDIA CUDA Toolkit lib/ Sandbox.pm Common code for the bin scripts. CpuAffinity.pm CPU Affinity support on Linux. src/ algorithm3.c Inline::C code for algorithm3.pl. bits.h Utility functions for byte array. output.h Fast printing of primes to a file descriptor. primesieve.c Inline::C code for primesieve.pl. sandbox.h Header file, includes bits.h, output.h, sprintull.h. sprintull.h Fast base10 to string conversion. typemap Type-map file for Inline::C.
Sending items to the windows recycle bin
1 direct reply — Read more / Contribute
by CrashBlossom
on Aug 11, 2023 at 16:08

    Not very sexy, but it some may find it useful.

    The following code was tested on Window 11 using the 64-bit version of strawberry 5.30.3. It was assembled by extracting the relevant bits from the Win32::FileOp module and making a simple change to account for the fact that I am using a 64-bit version of perl.

    use strict; use warnings; use Win32::API; sub FO_DELETE () { 0x03 } sub FOF_SILENT () { 0x0004 } # don't create progress/report sub FOF_NOCONFIRMATION () { 0x0010 } # Don't prompt the user. sub FOF_ALLOWUNDO () { 0x0040 } # recycle bin instead of delete sub FOF_NOERRORUI () { 0x0400 } # don't put up error UI sub Recycle { # a series of null-terminated pathnames, with a double null at the e +nd my $paths = join "\0", @_, "\0"; my $recycle = new Win32::API('shell32', 'SHFileOperation', 'P', 'I') +; my $options = FOF_ALLOWUNDO | FOF_NOCONFIRMATION | FOF_SILENT | FOF_ +NOERRORUI; # for everything except paths and options, pack with Q (rather than +L), since we're using 64-bit perl # my $opstruct = pack ('LLpLILLL', 0, FO_DELETE, $paths, 0, $options +, 0, 0, 0); my $opstruct = pack ('QQpQIQQQ', 0, FO_DELETE, $paths, 0, $options, +0, 0, 0); return $recycle->Call($opstruct); } my $file = "C:\\Users\\James\\fish"; my $rc = Recycle($file); print "RC: $rc\n";

    Return codes are described here:

    https://learn.microsoft.com/en-us/windows/win32/debug/system-error-codes--0-499-

Imagecat - show color images in a terminal
5 direct replies — Read more / Contribute
by cavac
on Jun 28, 2023 at 08:26

    A few days ago a played around with displaying (color) ASCII art in a Terminal in Re: 80x25 ASCII text art in terminal, because harangzsolt33 peaked my interest. i mentioned that it should be possible to display low res color images in the text console as well and that i would look into it if someone was interested.

    Turns out, the first interested party was myself. Literally a couple of hours after i posted, i had to sort through some PNG icons through an SSH connection. "Instead of downloading the folder, opening the files locally and finding the correct icon, wouldn't it be nice to just display a low res version in my terminal?". Yes, i know there are technically a few other tools that can already do this. But i decided i wanted a Perl version, so that i can easily customize it to my liking. I wanted to build it in a way that it ONLY uses very basic ANSI colors, to support as many color terminals as possible (and as long as they support Unicode).

    So, i created imagecat:

    Had a slight problem posting the original code to PerlMonks. The while @shades initialization is a single line in my original code, but PM refused to show Unicode in code tags. Basically, this is what it should look like (that is, unless there are more PM rendering bugs):

    my @shades = (' ', '░', '▒', '▓', '█');
    

    Yes, this could be improved with using full RGB colors and 2 "pixels" per character using something like 'Upper half block ▀' for a higher resolution. But for now, i just wanted to learn if i can do a version with much more basic color support. HSV color mapping is a strange beast... Edit: I wrote the full color, double-vertical resolution imagecat2, see my post below.

    PerlMonks XP is useless? Not anymore: XPD - Do more with your PerlMonks XP

Add your CUFP
Title:
CUFP:
Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post, it's "PerlMonks-approved HTML":


  • Are you posting in the right place? Check out Where do I post X? to know for sure.
  • Posts may use any of the Perl Monks Approved HTML tags. Currently these include the following:
    <code> <a> <b> <big> <blockquote> <br /> <dd> <dl> <dt> <em> <font> <h1> <h2> <h3> <h4> <h5> <h6> <hr /> <i> <li> <nbsp> <ol> <p> <small> <strike> <strong> <sub> <sup> <table> <td> <th> <tr> <tt> <u> <ul>
  • Snippets of code should be wrapped in <code> tags not <pre> tags. In fact, <pre> tags should generally be avoided. If they must be used, extreme care should be taken to ensure that their contents do not have long lines (<70 chars), in order to prevent horizontal scrolling (and possible janitor intervention).
  • Want more info? How to link or How to display code and escape characters are good places to start.