convert last small purpose of builtin.pm to C and NOOP require's I/O #22699

Open
wants to merge 2 commits into
base: blead

bulk88
Contributor

@bulk88 bulk88 commented Oct 24, 2024

See the commit text. Embedding modern Perl's near-universal .pm files and pragmas into static XS/C, to avoid parsing and I/O, is advantageous for Perl CI everywhere. builtin.pm was very easy to do because it was already 99% static XS, and it is part of the .pm dependency tree of -E"".

-E say is supposed to be shorter to type, but what's the point if it requires typing -I../lib every time when hacking on core?

More philosophically, I want the -E"say();" equivalent of my first -e"print();" from many years ago. I only do Perl because I could never get the C compiler to work after borrowing a big purple C book from my middle school library. I don't know why that book was even in that library.

Many decades ago, perl5 was supposed to be a better shell script or batch file, and perl was promised to be a single executable binary on disk, not 100s or 1000s of disk files like the C++ STL/.NET/Node/Java base class libraries, just to do hello world. If a newbie developer's broken first-ever perl install can at least do -E"say()", that will maybe keep someone in the Perl community. With a "feature.pm file not found" error, that person moves on to another programming language within a few minutes and never looks back at Perl.

node.bin is a fat-packed, pre-compiled, pre-jitted ~70 MB single OS binary file with the basic class library burned in (sort of like undump()). There are no env var or broken-install problems on that platform. The .js files that the installer puts on disk are only for the JS debugger to use. Node can't be compared to P5, but P5 can at least pick the low-hanging fruit and embed the basic .pm files and pragmas, or their primary execution paths, or lazy-load the pragma .pm files, etc.


  • This set of changes requires a perldelta entry, and it is included.

hv_store(inc_hv, "builtin.pm", STRLENs("builtin.pm"), newSVpvs(__FILE__), 0);
ver_gv = gv_fetchpvs("builtin::VERSION", GV_ADDMULTI, SVt_PV);
ver_sv = GvSV(ver_gv);
/* Remember to keep $VERSION in this file and $VERSION in builtin.pm synced. */
Contributor

Is it possible to have a test for that?

Contributor Author

The test is already in this patch. Change either side to 0.017 and the test fails.

Contributor

Apologies, I missed that.

-builtin.pm is now primarily for POD and .pm indexing tools, core, CPAN or
user written.  It also is a backup mechanism for very strange %INC
localization, clearing, or manipulation done by users, probably in a .t,
and whatever %INC manipulation is being done is probably developer error.
-This removes all the libc/kernel I/O calls for builtin.pm, and Perl code
parser overhead.
-A large benefit is that this commit is 50% of the work needed to make

perl -E 'say "hi";'

"/lib"-less or not dependent on any file I/O. perl.bin, libperl.so,
and miniperl.bin should be able to execute as a standalone binary.
If perl -e "1;" doesn't need a dozen separate library files,
perl -E "1;" also shouldn't need a dozen files.

perl -E "say 'Hello world';" should work, even with a broken perl
installation or unreachable "/lib/*.pm"s or broken "portable" perls.

Only a feature.pm dep is left, for -E to be lib-less.  That is for another
patch and PR in the future.
@bulk88 bulk88 force-pushed the remove_builtin_package_IO_dep_on_dot_pm branch from d5208db to eb882bf Compare October 24, 2024 21:24
@bulk88
Contributor Author

bulk88 commented Oct 24, 2024

Repushed; I forgot to stage an extra one-sentence comment in the .pm.

builtin.c Outdated
@@ -774,6 +774,9 @@ XS(XS_builtin_import)
void
Perl_boot_core_builtin(pTHX)
{
HV * inc_hv;
GV * ver_gv;
SV * ver_sv;
I32 i;
Contributor

Can we fold these declarations into the assignments below now that we're C99? That would seem more readable to me. The loop index should probably also move but that might be considered out of scope for this PR.

Contributor Author

Done in the latest revision.
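
The C99 style requested in this review thread can be illustrated outside the Perl API. What follows is a minimal sketch in plain C, with hypothetical function names (`sum_c89_style`, `sum_c99_style`) that do not come from builtin.c. The first function declares everything at the top of the block, as the diff above does; the second declares each variable where it is first assigned and scopes the loop index to the loop.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration only; neither function exists in builtin.c. */

/* C89 style: every declaration sits at the top of the block. */
long sum_c89_style(const int *vals, size_t count)
{
    long total;
    size_t i;

    assert(vals != NULL || count == 0);
    total = 0;
    for (i = 0; i < count; i++)
        total += vals[i];
    return total;
}

/* C99 style: declare at first use; the loop index lives only in the loop. */
long sum_c99_style(const int *vals, size_t count)
{
    assert(vals != NULL || count == 0);
    long total = 0;
    for (size_t i = 0; i < count; i++)
        total += vals[i];
    return total;
}
```

Both functions behave identically; the difference is only in how far a reader has to look to find a variable's type and initial value.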

@Leont
Contributor

Leont commented Oct 24, 2024

This would mean %INC would no longer be empty at startup, right? That will actually break code of mine. I suspect there will be tests broken by this too.

@bulk88
Contributor Author

bulk88 commented Oct 24, 2024

> This would mean %INC would no longer be empty at startup, right? That will actually break code of mine. I suspect there will be tests broken by this too.

cperl had a dozen things in %INC at startup. I don't remember any bug reports from back then, and nothing comes up on Google or GitHub in the way of bug reports or complaints. The sample size of people who would write a ticket for that fork is small, but I would have found a ticket of some kind if it were a common pattern to depend on a deep compare of %INC against a constant hash.

Preloading %INC sounds safer than hard-coding pp_require to skip the I/O on a string match. For "UNAUTHORIZED RELEASE"s, CPAN, future backports, or I don't know what else, delete $INC{} followed by require(); is the cargo-culted way to fork a module. If it is an XS .so, you are on your own to wipe the package glob without SEGVs. %INC preloads can be undone by anyone at any time; a memcmp() in pp_require can't easily be undone, or requires a documented backdoor/provision.
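
The contrast drawn here can be sketched with a toy model in plain C. This is not perl's implementation: `LoadedTable` is a hypothetical stand-in for %INC, and `require_with_preload` and `require_with_hardcoded_skip` are invented names that only imitate the two designs under discussion. The real pp_require does far more (searching @INC, compiling the file, error handling).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define MAX_LOADED 8

/* Toy stand-in for %INC. Stored pointers must outlive the table. */
typedef struct {
    const char *names[MAX_LOADED];
    size_t count;
} LoadedTable;

typedef enum { REQ_SKIPPED, REQ_DID_IO } RequireResult;

bool loaded_contains(const LoadedTable *t, const char *name)
{
    for (size_t i = 0; i < t->count; i++)
        if (strcmp(t->names[i], name) == 0)
            return true;
    return false;
}

bool loaded_add(LoadedTable *t, const char *name)
{
    if (t->count >= MAX_LOADED)
        return false;
    t->names[t->count++] = name;
    return true;
}

/* Models `delete $INC{...}`: returns true if an entry was removed. */
bool loaded_delete(LoadedTable *t, const char *name)
{
    for (size_t i = 0; i < t->count; i++) {
        if (strcmp(t->names[i], name) == 0) {
            t->names[i] = t->names[--t->count];
            return true;
        }
    }
    return false;
}

/* Design A: startup preloads the table; require only consults the table. */
RequireResult require_with_preload(LoadedTable *t, const char *name)
{
    assert(t != NULL && name != NULL);
    if (loaded_contains(t, name))
        return REQ_SKIPPED;
    /* A real interpreter would locate and read the file here. */
    loaded_add(t, name);
    return REQ_DID_IO;
}

/* Design B: the name check is compiled into require itself. */
RequireResult require_with_hardcoded_skip(LoadedTable *t, const char *name)
{
    assert(t != NULL && name != NULL);
    if (strcmp(name, "builtin.pm") == 0)
        return REQ_SKIPPED;   /* no table edit can re-enable the load */
    return require_with_preload(t, name);
}
```

In this model, deleting the "builtin.pm" entry makes design A fall through to the file-loading path again, while design B keeps skipping regardless of what the caller does to the table.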

Or idea 3: add /lib/strict/PP.pm and hard-code a str_eq/memcmp into pp_require. Remember that the original PP .pm implementation, if it exists, has to stay around for some months or years, and has to be CI-ed against the built-in XS version on a make test. It might be build-only and never installed, maybe with the .pod kept separate, but I think the PP .pm initially has to stay around for CI.

Another argument is that hard-coding string-equality skips into pp_require() is fine. The strict.pm module/package/disk file is owned by P5P. Random local() monkey patching of CVs/globs by random modules, where the developer decided to replace, at runtime, the vars token in use strict 'vars'; with use strict 'dacostumers';, is not supported, ever.

Another argument: the upstream dependency list of any CPAN or core module is undefined behavior. There is no API contract that a module will never add or remove an upstream dependency between releases. And the interpreter has been a module ever since DynaLoader:: was introduced, or maybe since https://perl5.git.perl.org/perl5.git/blob/e334a159a5616cab575044bafaf68f75b7bb3a16:/usub/curses.mus so if %INC is suddenly dirty, a clean %INC was never promised. See this line in modern perl:

perl5/gv.c

Line 1235 in 9a9d70c

if (stash_name && memEQs(stash_name, HvNAMELEN_get(stash), "IO::File")

@bulk88
Contributor Author

bulk88 commented Oct 25, 2024

Another idea, for modules that are P5P-upstream or that P5P does not ship on CPAN, i.e. the "no updates unless you install a new major release" kind of module:

I propose that, from now on, P5P modules are blacklisted from %INC and appear only in a new `%^INC` global hash, which should be read-only from day one. These are light footsteps towards code signing, DRM, security against tampering with .pm files by a maid, and auditing perl interpreters for useless government reports.

-silence nearby MSVC x64 only truncation warnings
@tonycoz
Contributor

tonycoz commented Nov 4, 2024

The original specification of builtin suggested we might ship builtin as a module on CPAN that implemented backports of at least some of the builtins.

From what I can see this change prevents such an implementation from working.

5 participants