JIT: Visit blocks in RPO during LSRA #107927
Conversation
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
This reverts commit 4a9e0ff.
/azp run runtime-coreclr outerloop, Fuzzlyn
Azure Pipelines successfully started running 2 pipeline(s).
cc @dotnet/jit-contrib, @AndyAyersMS @kunalspathak PTAL. Fuzzlyn failures are known or NaN false positives. Note that LSRA previously had three ordering strategies: the default preds-first one, lexical order, and randomized order (which was never implemented). RPO looks like a viable replacement for the first one, and lexical order doesn't make much sense if we plan to move block layout later, so I removed the functionality for specifying LSRA's block order. Is it ok to remove this for now, and add it back in if/when we decide to implement a randomized order?
Diffs are large, though a net size improvement. Looking at the instructions retired per collection, the larger MinOpts TP regressions are concentrated in collections with relatively few MinOpts methods, so I think the TP impact isn't as bad as it looks.
The Fuzzlyn linux/arm failures seem to expose some more issues caused by this change. We should address all of them before merging this PR.
The assertion
/azp run Fuzzlyn
Azure Pipelines successfully started running 1 pipeline(s).
@kunalspathak the Linux arm failure didn't repro in the last Fuzzlyn run, so if it is a bug, it doesn't readily repro. Aside from inner/outerloop tests, are there any other suites you'd like me to run? JitStress doesn't seem to do anything interesting for block ordering, though if it's still worthwhile to run LSRA stress modes, I can do that -- thanks!
Yes, since Fuzzlyn is randomized, it might not necessarily repro on every run, but we should take the failure that we saw in the previous run, see why it showed up with this change, and go from there. I would usually run
/azp run runtime-coreclr jitstressregs, runtime-coreclr jitstressregs-x86, runtime-coreclr jitstress2-jitstressregs
Azure Pipelines successfully started running 3 pipeline(s).
Note that the diffs from the latest run look quite different because the collections got a bit messed up on x64: we're missing
When visiting a block during an RPO traversal, we check if the block is a loop header, and if so, we visit the rest of the loop's body before visiting anything else -- value numbering currently does this, if you want to see what the implementation looks like. This visit ordering has the nice property of keeping loop bodies compact.
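For intuition, here is a minimal sketch of that compaction property (Python for brevity rather than the JIT's C++; the `loops` mapping from each header to its body blocks is an assumed input, not the JIT's actual data structure). Starting from a plain RPO, each loop's body blocks are pulled forward to sit immediately after their header, preserving their relative order:

```python
def compact_loops(rpo, loops):
    """Reorder an RPO so each loop body is contiguous.

    rpo:   list of blocks in reverse post-order
    loops: dict mapping each loop header to the set of blocks
           in that loop's body (header excluded)
    """
    order, placed = [], set()
    for block in rpo:
        if block in placed:
            continue
        order.append(block)
        placed.add(block)
        if block in loops:
            # Pull the loop's body up to follow its header,
            # keeping the body blocks' relative RPO order.
            for member in rpo:
                if member in loops[block] and member not in placed:
                    order.append(member)
                    placed.add(member)
    return order
```

For example, with a plain RPO `[0, 1, 2, 4, 3]` where block `1` heads a loop with body `{2, 3}` and `4` is the loop exit, the compacted order is `[0, 1, 2, 3, 4]`. A real implementation would also handle nested loops (e.g. by recursing innermost-first), which this sketch glosses over.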
Do we have a tracking issue in the dotnet/runtime repo? There are more regressions than improvements, even though the asmdiff code size/PerfScore numbers suggested otherwise. We should revisit some of the regressions.
The first one is #108201. Still waiting on arm64 results (tomorrow). Also, some of this might be mitigated by the loop-aware RPO.
@amanasifkhalid - can we check locally whether the loop-aware RPO helps mitigate some of the regressions?
Sure, I'll try that today. I'll have the top regressions collated here soon.
x64 regressions:

x64 improvements:
@amanasifkhalid - thanks for sharing the data, but can you please summarize the takeaway from it and the next steps?
I'm still trying to repro the top regressions locally -- if I revert this change on top of [...] You're correct that we have more regressions than improvements (286 vs. 176) at the moment. To get an idea of how the magnitudes of regressions/improvements compare, here are some histograms. Note that some improvements became regressions over time, and vice versa, hence the odd tails for the recent scores. Looking at the original scores, it looks like the improvements tend to be bigger than the regressions, which seems promising for the loop-aware RPO?
I've looked at some regressions locally, and some look like they can easily be fixed by the loop-aware RPO. In the absence of high-fidelity edge likelihoods, we can end up with flowgraphs like this:
Since the profile-aware RPO only considers edge likelihoods, there isn't an obvious successor of
The old LSRA block order, by contrast, uses block weights to decide on the next successor, so it gets this one right:
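To make the difference concrete, here is a hypothetical sketch (the helper name and inputs are illustrative, not JIT code) of successor selection that consults edge likelihoods first and falls back to block weights on ties, which is how a 50/50 likelihood split can still resolve to the hot path:

```python
def pick_next_block(candidates, block_weight, edge_likelihood):
    """Pick the next successor to sequence.

    candidates:      list of successor blocks to choose from
    block_weight:    dict of profile weights per block
    edge_likelihood: dict of outgoing-edge likelihoods per block
    """
    best = max(edge_likelihood[b] for b in candidates)
    tied = [b for b in candidates if edge_likelihood[b] == best]
    if len(tied) == 1:
        # Likelihoods alone give a clear winner.
        return tied[0]
    # Likelihoods tie (e.g. 0.5/0.5): break the tie with block
    # weights, as the old preds-first order effectively did.
    return max(tied, key=lambda b: block_weight[b])
```

With equal likelihoods but very different block weights, a likelihood-only policy has no basis to prefer either successor, while this fallback still picks the hot one.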
The loop-aware RPO gets such examples right because of the presence of loops, but it's otherwise not aware of successor blocks' weights. For example, consider this flowgraph, which doesn't have any loops:
Both RPO-based orderings interleave the cold block with the hot paths:
Whereas the previous implementation doesn't:
To handle these cases, I think we can emulate what we do during block reordering, and push rarely-run blocks to the end of the order. It's trivial to implement, and we don't have to worry about EH constraints like we do during block reordering. I think we eventually want these mismatches between likelihoods and block weights to disappear by running profile synthesis late in the frontend, though I don't think I'll get to that until later, so this seems like a decent fix for now. For the remaining regressions I looked at, I'm seeing slight differences in code layout due to more critical edges being split. This seems to happen in cases where the old ordering breaks ties using [...] @AndyAyersMS @kunalspathak does this all sound reasonable? Thanks!
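Pushing rarely-run blocks to the end amounts to a stable partition of the visitation order; a sketch (illustrative names, assuming per-block profile weights are available):

```python
def defer_cold_blocks(order, block_weight, cold_threshold=0.0):
    """Push rarely-run blocks to the end of the visitation order.

    Relative order is preserved within the hot and cold partitions
    (a stable partition), so hot paths stay contiguous up front.
    """
    hot = [b for b in order if block_weight[b] > cold_threshold]
    cold = [b for b in order if block_weight[b] <= cold_threshold]
    return hot + cold
```

Because this only permutes the sequence handed to the allocator rather than the blocks themselves, none of the EH-region placement constraints from real block reordering apply.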
Emulating reordering seems plausible, I guess, but then perhaps we should simply run ordering before LSRA (and re-ordering later if there are new blocks), and have LSRA just use the lexical order? For benchmark runs, I'm surprised we don't see PGO everywhere... are we measuring non-PGO code in some tests?
I was thinking about going this route; from what we see above, better LSRA block orderings also tend to look like better block layouts, so it seems reasonable to just use lexical ordering. The only hurdles I see to this are the fact that we cannot move cold EH blocks to the end of the main body, and the fact that switch lowering can change flow in between block layout and LSRA. We already don't put much effort into ordering switch successors optimally (though 3-opt will probably fix this automatically), so maybe the latter point is fine? I'll give this a shot.
As far as I know, all the microbenchmarks use PGO; the non-PGO examples were PerfScore regressions handpicked from non-tiered SPMI collections to illustrate limitations. For the few benchmark regressions I was able to repro locally, the churn was primarily driven by more critical edges being split, and thus by changes in code layout. My understanding of LSRA's edge resolution is limited, but I don't see an obvious fix for these cases.
Ah, good point... LSRA "layout" need not be EH aware at all. |
While snooping around LSRA, I noticed this TODO, where the logic for remembering the first cold location assumes the first cold block is the beginning of a contiguous cold section. As mentioned above, block layout cannot satisfy this property when we have EH regions, so if keeping this state accurate is important, then we cannot rely on layout order for LSRA. The current RPO traversal doesn't ensure cold blocks are visited last either, so I think it's worth pursuing this invariant as a next step, alongside getting the loop-aware RPO checked in. (Sorry for the recent silence on this front. I've been trying to figure out the source of a TP regression for a massive MinOpts method in #108147, but I haven't been able to get a good trace from Pin on multiple machines. That change is a nice-to-have, so I guess we can get these tweaks into LSRA's FullOpts block order first.)
Part of #107749, and follow-up to #107927. When computing an RPO of the flow graph, ensuring that the entirety of a loop body is visited before any of the loop's successors has the benefit of keeping the loop body compact in the traversal. This is certainly ideal when computing an initial block layout, and may be preferable for register allocation, too. Thus, this change formalizes loop-aware RPO creation as part of the flowgraph API surface, and uses it for LSRA's block sequence.
Part of #107749. LSRA currently does a lexical pass over the block list to build a visitation order. Since we intend to run block layout after LSRA with #107634, LSRA ideally shouldn't be sensitive to lexical ordering, and since the current logic tries to visit a block's predecessors before the block itself, it seems easier and faster to just use an RPO traversal.
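For reference, the classic RPO construction is just a depth-first post-order, reversed -- a sketch (Python rather than the JIT's C++; the adjacency-dict graph representation is an assumption for illustration):

```python
def rpo(succs, entry):
    """Reverse post-order via iterative DFS: every block appears
    before its successors, except along back edges."""
    visited, postorder = {entry}, []
    stack = [(entry, iter(succs.get(entry, ())))]
    while stack:
        block, it = stack[-1]
        for s in it:
            if s not in visited:
                visited.add(s)
                stack.append((s, iter(succs.get(s, ()))))
                break
        else:
            # All successors handled: record this block post-order.
            postorder.append(block)
            stack.pop()
    return postorder[::-1]
```

This gives the predecessors-before-block property the old preds-first logic worked to approximate (up to back edges, where it is unachievable) without depending on lexical block order at all.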