Fix duplicated requests in SC UI #73
Codecov Report
@@            Coverage Diff             @@
##           master      #73      +/-   ##
==========================================
+ Coverage   95.53%   95.55%   +0.01%
==========================================
  Files          14       14
  Lines         739      742       +3
==========================================
+ Hits          706      709       +3
  Misses         33       33
sh_scrapy/middlewares.py
Outdated
        if type(response).__name__ == "DummyResponse":
            return response
I am personally not a fan of checking like this, for the record, but I am OK with it, and I realize it simplifies test/CI changes.
+1; I was also thinking about a conditional import. But the current approach looks good enough, and it does simplify testing (though we're not testing the real use case explicitly - it's only tested via manual QA).
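For comparison, the conditional-import alternative mentioned above could look roughly like the sketch below. It is not what this PR does. The helper name `is_dummy_response` is hypothetical, and the import path `scrapy_poet.api` is assumed from the scrapy-poet file linked elsewhere in this review.

```python
# Sketch only: the conditional-import alternative, not the approach merged here.
try:
    # scrapy-poet is an optional dependency, so the import may fail.
    from scrapy_poet.api import DummyResponse
except ImportError:
    DummyResponse = None  # scrapy-poet is not installed


def is_dummy_response(response):
    """Return True only if `response` is scrapy-poet's DummyResponse.

    Hypothetical helper name, used for illustration.
    """
    if DummyResponse is None:
        return False
    return isinstance(response, DummyResponse)
```

The trade-off discussed in this thread still applies: testing this variant properly would require scrapy-poet to be available in CI, whereas the name-based check can be exercised with a local stand-in class.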
tests/test_middlewares.py
Outdated
@dataclass
class DummyResponse:
    url: str
Minor: just so that we're testing as close to scrapy-poet's DummyResponse as possible, what do you think about copying the 3 lines of code from https://github.com/scrapinghub/scrapy-poet/blob/957dc34808e46059a07dc69428d5d4dca6c71ecf/scrapy_poet/api.py#L10-L31 ?
Actually, not much code there - will copy and add, thank you @BurnzZ
we might not need all of this code even in scrapy-poet; see scrapinghub/scrapy-poet#99
so, I probably wouldn't copy the init method
Anyways, it seems it doesn't matter much; no pushback at all on merging as-is.
LGTM!
sh_scrapy/middlewares.py
Outdated
@@ -60,6 +60,11 @@ def process_request(self, request, spider):
        request.meta[HS_PARENT_ID_KEY] = request_id

    def process_response(self, request, response, spider):
        # This class of response check is intended to fix the bug described here
        # https://github.com/scrapy-plugins/scrapy-zyte-api/issues/112
        if type(response).__name__ == "DummyResponse":
We just discussed this in a meeting with @kmike.
In addition to the name, can we also check the import path? Something like
type(response).__module__ == "scrapy_poet.api"
or
type(response).__module__.startswith("scrapy_poet")
if we want to avoid problems in case the import path changes.
It probably does not happen often, but I'm concerned that any user-defined DummyResponse will also trigger this code path.
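To make the concern concrete, here is a minimal sketch of the combined name and module check, using the `startswith` variant suggested above. The helper name `is_scrapy_poet_dummy_response` and the two stand-in classes are hypothetical and exist only for illustration; scrapy-poet itself is not imported.

```python
def is_scrapy_poet_dummy_response(response):
    """Check both the class name and the module it was defined in."""
    cls = type(response)
    return (
        cls.__name__ == "DummyResponse"
        and cls.__module__.startswith("scrapy_poet")
    )


# A user-defined class that merely shares the name (defined in this module).
class DummyResponse:
    pass


# A stand-in whose __module__ mimics scrapy-poet's import path.
FakePoetResponse = type("DummyResponse", (), {"__module__": "scrapy_poet.api"})

user_defined = DummyResponse()
poet_like = FakePoetResponse()
```

With the name-only check both objects would match; with the module check added, only `poet_like` does. Note that a prefix test would also accept any other package whose name begins with `scrapy_poet`.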
Thank you @elacuesta , the fix is added.
Co-authored-by: Eugenio Lacuesta <[email protected]>
This PR fixes two issues reported in scrapy-plugins/scrapy-zyte-api#112.
The idea is to avoid counting a request when its response is an instance of the DummyResponse class.
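The idea described above can be sketched roughly as follows. This is a simplified stand-in, not the actual sh_scrapy middleware: the class name `RequestCounterSketch` and its `counted` attribute are hypothetical, and the check is assumed to combine the class name with the module prefix as discussed in the review.

```python
class RequestCounterSketch:
    """Hypothetical downloader-middleware-like class that counts responses."""

    def __init__(self):
        self.counted = 0

    def process_response(self, request, response, spider):
        cls = type(response)
        # Assumed check: skip scrapy-poet's placeholder responses, because
        # no real download happened and counting them duplicates requests.
        if cls.__name__ == "DummyResponse" and cls.__module__.startswith("scrapy_poet"):
            return response
        # Real responses are counted as before.
        self.counted += 1
        return response
```

Returning the response unchanged keeps the rest of the middleware chain working; only the bookkeeping step is skipped for placeholder responses.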