Index or iterate - your choice

GrandFather has asked for the wisdom of the Perl Monks concerning the following question:

Re: Index or iterate - your choice by Discipulus (Canon) on Jan 27, 2021 at 07:54 UTC
Hello GrandFather, my 2 cents, as I don't know the matter of ELF files at all, so a rubber duck service from me. Why index? `GetSegments` returns an arrayref, but then you access it by index using another call to `SegmentCount`, whereas I expected something like `foreach my $segment ( @{$elfFile->GetSegments()} ) ..` If the data is already stored inside the object, and so is not huge, I personally find returning it all via `GetSegments` simpler to understand and use. If, on the other hand, the data is bigger and you parse it live, the iterator makes much more sense. So for me, if the data will always fit inside the object, provide it whole via `GetSegments` and stop. Only if the data can be bigger and you don't precompute it in advance does the iterator make sense as an alternative. Basically the problem can be reduced to `@lines = <$handle>` as opposed to `while (<$handle>)`, with the second being more idiomatic and memory safe; but if you already have `@lines` filled, the iterator makes little sense to me. L* There are no rules, there are no thumbs.. Reinvent the wheel, then learn The Wheel; may be one day you reinvent one of THE WHEELS.
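To make the slurp-versus-iterate comparison above concrete, here is a minimal sketch. It assumes an in-memory file so that it is self-contained; the data and variable names are made up for illustration and have nothing to do with the ELF module under discussion.

```perl
use strict;
use warnings;

# In-memory "file" so the sketch needs nothing on disk.
my $data = "alpha\nbeta\ngamma\n";

# Slurp: every line is held in memory at once.
open my $slurpFh, '<', \$data or die "Can't open in-memory file: $!";
my @lines = <$slurpFh>;
close $slurpFh;

# Iterate: only the current line is held in memory.
open my $iterFh, '<', \$data or die "Can't open in-memory file: $!";
my $count = 0;
while (my $line = <$iterFh>) {
    ++$count;
}
close $iterFh;

print scalar(@lines), " lines slurped, $count lines iterated\n";
```

Both loops see the same lines; the difference only matters once the input is too large to hold comfortably in `@lines`.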
Re^2: Index or iterate - your choice by GrandFather (Saint) on Jan 27, 2021 at 21:01 UTC
That helps. Thanks teddy bear (or rubber duck, as the case may be). I was overthinking the plumbing, so the "index" variant at least can be much simpler by returning a list, as implied by your comments and stated explicitly by tobyink. In a typical ELF file the number of entries is small and the entries are small and of fixed size, so there is no issue returning a list. Optimising for fewest key strokes only makes sense transmitting to Pluto or beyond
Re: Index or iterate - your choice by tobyink (Canon) on Jan 27, 2021 at 19:46 UTC
Generally speaking, if you want to provide a list of things, just return a list. People can assign it to an array and loop through it, access a particular element by index, etc. The exceptions would be where there are so many list items or the items are so big, that they would use too much memory to store in an array, so accessing them one by one is better; or if generating each item is relatively expensive (in terms of time, CPU, network activity, etc) so if you can avoid fetching the entire list, that is preferable. In these cases use an iterator instead. toby döt ink
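A rough sketch of the iterator case described above, where items are only produced when asked for. `MakeIterator` is a hypothetical name, and squaring a number merely stands in for whatever expensive work a real module would do per item.

```perl
use strict;
use warnings;

# Hypothetical generator: returns a closure that yields one item per call
# and undef when exhausted. Nothing is computed until it is asked for.
sub MakeIterator {
    my ($limit) = @_;
    my $next = 0;

    return sub {
        return if $next >= $limit;

        my $value = $next * $next;    # stand-in for an expensive computation
        ++$next;
        return $value;
    };
}

my $iter = MakeIterator(1_000_000);
my @firstThree;

while (defined(my $item = $iter->())) {
    push @firstThree, $item;
    last if @firstThree == 3;         # the remaining items are never generated
}

print "@firstThree\n";
```

Returning a plain list from the same data would have computed all million items before the caller saw the first one.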
Re^2: Index or iterate - your choice by thomas895 (Deacon) on Jan 27, 2021 at 21:12 UTC
"where there are so many list items or the items are so big, that they would use too much memory to store in an array" Perhaps tying is another good option for this. Then you can get all the semantics of an array for free! (from the user perspective at least) -Thomas "Excuse me for butting in, but I'm interrupt-driven..."
Re^3: Index or iterate - your choice by tobyink (Canon) on Jan 27, 2021 at 23:17 UTC
I did consider including that, but by providing a tied array, you're kind of encouraging end users to treat it as any old array. So they might not consider that doing something like `foreach my $item ( reverse @array ) { ... }` is going to impact performance way more than they might expect. If it's exposed as an iterator, then it encourages them to access items in a one-at-a-time sequential fashion. They still can slurp it all into an array, but they can't blame you when that eats up all their memory. Not saying it's never a good idea, but situations where it is aren't going to be that common.
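To illustrate the concern above, here is a small sketch of a tied array whose elements are computed on demand. `LazySquares` and its fetch counter are invented for this example; the point is only that ordinary-looking array code can quietly pull every element through `FETCH`.

```perl
use strict;
use warnings;

# Hypothetical tied-array class: element $i is computed on demand as $i ** 2.
{
    package LazySquares;

    our $FetchCount = 0;    # how many times an element was actually computed

    sub TIEARRAY  { my ($class, $size) = @_; return bless {size => $size}, $class }
    sub FETCHSIZE { my ($self) = @_; return $self->{size} }
    sub FETCH     { my ($self, $index) = @_; ++$FetchCount; return $index ** 2 }
}

tie my @squares, 'LazySquares', 10;

my $single      = $squares[5];                 # computes just this element
my $afterSingle = $LazySquares::FetchCount;

my @copy      = reverse @squares;              # every element has to be fetched
my $afterCopy = $LazySquares::FetchCount;

print "element 5: $single\n";
print "FETCH calls: $afterSingle after one lookup, $afterCopy after reversing\n";
```

From the caller's side both statements look equally cheap, which is the trap being described.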
Re^2: Index or iterate - your choice by GrandFather (Saint) on Jan 27, 2021 at 21:19 UTC
After I posted it I wondered if my question was too trivial to post. But it turned out to be a great sanity check! I'm now returning a list as suggested. Thank you, Discipulus and jdporter, for shining a spotlight on this!
Re: Index or iterate - your choice by bliako (Monsignor) on Jan 27, 2021 at 10:18 UTC
I am not acquainted with ELF or with what functionality a potential user may desire from your module, so I will risk some general points. Filtering (the ELF segments) can be important and could even expand to checks other than size>0. In both `get()` methods you do this filtering by hand, inside the loop. If indeed there is scope for filtering useless segments and selecting "interesting" (hmmmm!!) segments then I would re-pose the problem. It looks to me that such a functionality would be very appealing. But is the array or the iterator best suited for this? Usually when filtering, one expects the result collection to be of the same type as the input, e.g. an array or iterator (Edit: and even offer in-place editing). Neither looks suitable for filtering though! Splicing an array? A linked list? There is also the question of passing the segments, array or iterator, to another sub/module for further processing, filtering, profiling etc. I would think an arrayref is the lowest common denominator here, unless that other module is yours and you control the API. But there is also the problem of creating new iterators to return after filtering and selecting, possibly pipelined. Would (not-)garbage-collecting the iterators be a problem? As I said, I am ignorant about ELF, but perhaps it makes sense to store the segments in a linked list/graph/tree? Perhaps for adding/removing a segment and then writing out the ELF?
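A minimal sketch of the "same type in, same type out" filtering idea raised above, for both a list and an iterator. The segment records here are plain hashes invented for illustration, and `IterateOver` and `FilterIterator` are hypothetical helpers, not part of any module mentioned in this thread.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for segment records; a real module would return objects.
my @segments = (
    {name => 'PHDR', size => 0},
    {name => 'LOAD', size => 4096},
    {name => 'NOTE', size => 0},
    {name => 'DATA', size => 512},
);

# List in, list out: plain grep.
my @nonEmpty = grep {$_->{size} > 0} @segments;
my @names    = map {$_->{name}} @nonEmpty;

# Turn a list into an iterator (a closure returning undef when exhausted).
sub IterateOver {
    my @items = @_;
    return sub {return shift @items};
}

# Iterator in, iterator out: wrap the source and skip rejected items.
sub FilterIterator {
    my ($source, $wanted) = @_;

    return sub {
        while (defined(my $item = $source->())) {
            return $item if $wanted->($item);
        }
        return;
    };
}

my $filtered = FilterIterator(IterateOver(@segments), sub {$_[0]{size} > 0});
my @seen;

while (defined(my $segment = $filtered->())) {
    push @seen, $segment->{name};
}

print "grep: @names\n";
print "iterator: @seen\n";
```

Because `FilterIterator` returns another closure, filters can be stacked into a pipeline, and each wrapper is released by normal reference counting once the caller drops it.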
Re^2: Index or iterate - your choice by GrandFather (Saint) on Jan 27, 2021 at 21:46 UTC
Nice! I hadn't thought of filtering returned results. To me the iterator feels like a pretty natural way to do that. As it stands the module is a reader, so the semantics for the collection are read-only. However, changing the module to an ELF manager is an attractive thought. It impinges on ELF::Writer's solution space, but the APIs for my module and NERDVANA's module are quite different, so maybe creating a "one size fits all" module makes sense?
Re: Index or iterate - your choice by jdporter (Paladin) on Jan 27, 2021 at 14:29 UTC
It looks like `GetSegments` returns an array ref. If so, then `SegmentCount` is superfluous, no? `my $segments = $elfFile->GetSegments(); for my $segIndex (0 .. $#$segments) { $segments->[$segIndex]->FileSize() or next; print $segments->[$segIndex]->Describe(head => 16, tail => 16, width => 32) };` But even the index is unnecessary (unless you actually need the number for some reason): `print $_->Describe(head => 16, tail => 16, width => 32) for grep $_->FileSize, @$segments;` If `GetSegments` doesn't in fact return an array ref, then make a method that does. :-) It could even be a tied array for more perlishness.
Re^2: Index or iterate - your choice by GrandFather (Saint) on Jan 27, 2021 at 21:14 UTC
Actually `GetSegments()` was returning the internal `ProgramHeader` object, which had an overload for `@{}`. But that was overthinking the plumbing, so I've changed it to return a list.
Re: Index or iterate - your choice by jcb (Parson) on Jan 28, 2021 at 02:19 UTC
ELF structural tables tend to be relatively small, so simply returning a list (as other monks have suggested and you seem to have chosen) is probably the best solution. Further, the records themselves (excluding the actual contents) are of very limited and finite size. I would suggest "splitting the difference" and returning a list of objects that describe the table entries fully and carry internal file references for the actual data. Producing a tied filehandle "on demand" that reads the extent for the actual data is not difficult; you need only remember the position and length and enforce the artificial EOF.
Re^2: Index or iterate - your choice by GrandFather (Saint) on Jan 28, 2021 at 02:47 UTC
As mentioned above, the various replies have stimulated me into returning lists for segments and sections which, as you point out, are small in number and small in size. However, I'm going to provide an iterator for returning the contents of segments and sections. Parameters to the iterator generator will let me set the maximum size of blobs returned and allow specifying selected portions of the blocks to be returned. That fits nicely with common use patterns where fixed size blocks are consumed by processes such as flash programmers and debuggers. That also helps with code that might write files for consumption by flash programmers, debuggers etc. Methods that return lists or iterators can both benefit from providing filtering, so I'll roll that in as an option too. Any other kitchen sinks I should add?
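One way the iterator generator described above might look, as a sketch only. `MakeBlockIterator` and its `maxSize`, `start` and `length` parameters are invented names, and the contents are a dummy string rather than real segment data.

```perl
use strict;
use warnings;

# Hypothetical iterator generator: yields successive blocks of at most
# $args{maxSize} bytes from the selected portion of $contents. All parameter
# names here are invented for illustration.
sub MakeBlockIterator {
    my ($contents, %args) = @_;
    my $maxSize = $args{maxSize} || 16;
    my $pos     = $args{start}   || 0;
    my $end     = defined($args{length}) ? $pos + $args{length} : length($contents);

    $end = length($contents) if $end > length($contents);

    return sub {
        return if $pos >= $end;

        my $size = $end - $pos;
        $size = $maxSize if $size > $maxSize;

        my $block = {address => $pos, data => substr($contents, $pos, $size)};
        $pos += $size;
        return $block;
    };
}

my $contents  = 'x' x 40;    # 40 bytes of stand-in segment data
my $nextBlock = MakeBlockIterator($contents, maxSize => 16, start => 4, length => 30);
my @blocks;

while (my $block = $nextBlock->()) {
    push @blocks, sprintf('%d+%d', $block->{address}, length($block->{data}));
}

print "@blocks\n";
```

Each returned block carries its own offset, which is the sort of thing a flash programmer or a file writer would need alongside the bytes.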
Re^3: Index or iterate - your choice by jcb (Parson) on Jan 29, 2021 at 02:20 UTC
I would still advocate using a tied filehandle for reading contents, since that is effectively a built-in iterator interface for byte streams.
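A minimal sketch of the tied-filehandle idea, assuming a seekable underlying handle: the tie remembers a position and an end offset and refuses to read past the end. `ExtentHandle` is a hypothetical class, only `READ` and `EOF` are implemented, and the "file" is an in-memory string.

```perl
use strict;
use warnings;

{
    package ExtentHandle;

    # Hypothetical tied-handle class exposing $length bytes, starting at
    # $start, of an already open, seekable file handle.
    sub TIEHANDLE {
        my ($class, $fh, $start, $length) = @_;
        return bless {fh => $fh, pos => $start, end => $start + $length}, $class;
    }

    sub EOF {
        my ($self) = @_;
        return $self->{pos} >= $self->{end};
    }

    # Called for read(HANDLE, $buffer, $len, $offset).
    sub READ {
        my ($self, undef, $len, $offset) = @_;
        my $bufref = \$_[1];                   # $_[1] aliases the caller's buffer
        my $left   = $self->{end} - $self->{pos};

        $len = $left if $len > $left;          # enforce the artificial EOF
        seek($self->{fh}, $self->{pos}, 0) or return;

        my $got = read($self->{fh}, $$bufref, $len, $offset || 0);
        $self->{pos} += $got if $got;
        return $got;
    }
}

my $image = 'HEADER--' . 'segment contents' . '--TRAILER';
open my $fh, '<', \$image or die "Can't open in-memory file: $!";

# Expose only the 16 bytes that follow the 8 byte header.
tie *EXTENT, 'ExtentHandle', $fh, 8, 16;

my $contents = '';
while (read(EXTENT, my $chunk, 10)) {
    $contents .= $chunk;
}

print "[$contents]\n";
```

A fuller version would also want `READLINE`, `GETC` and `CLOSE`, and would have to decide how to share the underlying handle between several extents.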

