Couldn't create custom Provider #86
Hi @suspectinside, I'm not able to reproduce this locally: the following minimal code, derived from your example, runs fine on my end. I suspect that something outside of your code example causes the issue. Unfortunately, the logs you've posted don't pinpoint the problem. Could you try copying the code below into 3 different modules in your project to see if it works?

# providers.py
import logging
from typing import Set
from collections.abc import Callable

from scrapy_poet.page_input_providers import PageObjectInputProvider

logger = logging.getLogger()


class Arq:
    async def enqueue_task(self, task: dict):
        logger.info('Arq.enqueue_task() enqueueing new task: %r', task)


class ArqProvider(PageObjectInputProvider):
    provided_classes = {Arq}
    name = 'ARQ_PROVIDER'

    async def __call__(self, to_provide: Set[Callable]):
        return [Arq()]

# pageobjects.py
import attr
from web_poet.pages import Injectable, WebPage, ItemWebPage

from .providers import Arq


@attr.define
class IndexPage(WebPage):
    arq: Arq

    async def page_titles(self):
        await self.arq.enqueue_task({'bla': 'bla!'})
        return [
            (el.attrib['href'], el.css('::text').get())
            for el in self.css('.selected a.reference.external')
        ]

# spiders/title_spider.py
import scrapy

from ..pageobjects import IndexPage
from ..providers import ArqProvider


class TitlesLocalSpider(scrapy.Spider):
    name = 'titles.local'
    start_urls = ["https://books.toscrape.com"]
    custom_settings = {
        "SCRAPY_POET_PROVIDERS": {
            ArqProvider: 600,  # MY PROVIDER FOR INJECTABLE arq: Arq
        },
        "DOWNLOADER_MIDDLEWARES": {
            "scrapy_poet.InjectionMiddleware": 543,
        },
    }

    async def parse(self, response, index_page: IndexPage):
        # page_titles is an async method, so call it before awaiting
        self.logger.info(await index_page.page_titles())
Could |
I've tried adding the |
Yep! Thanks a lot, I found the source of the problem: it happens if I use the new builtins.set (with generics support) instead of typing.Set (deprecated since 3.9). So if I change

async def __call__(self, to_provide: set[Callable], settings: Settings) -> Sequence[Callable]:

into something like this:

from typing import Set
# ...
async def __call__(self, to_provide: Set[Callable], settings: Settings) -> Sequence[Callable]:

everything works correctly. By the way, in any case, scrapy-poet (web-poet) is one of the best approaches I've ever seen, and the combination of IoC and the Page Object Model pattern for scraping really shines! Thanks a lot for it ;)
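For readers wondering how the two annotations can behave differently, here is a small standalone sketch (Python 3.9+, no scrapy-poet involved). It only shows that `set[Callable]` and `Set[Callable]` are different kinds of objects at runtime even though they describe the same type. Whether this difference is exactly what the dependency-injection machinery stumbles over is an assumption, not something confirmed in this thread.

```python
import types
from collections.abc import Callable
from typing import Set, get_args, get_origin

builtin_alias = set[Callable]  # PEP 585 style, available since Python 3.9
typing_alias = Set[Callable]   # older typing-module style

# Both annotations resolve to the same origin type and the same arguments...
same_origin = get_origin(builtin_alias) is get_origin(typing_alias) is set
same_args = get_args(builtin_alias) == get_args(typing_alias) == (Callable,)

# ...but they are instances of different alias classes, so code that
# inspects annotations by class may treat them differently.
builtin_is_generic_alias = isinstance(builtin_alias, types.GenericAlias)
typing_is_generic_alias = isinstance(typing_alias, types.GenericAlias)

print(same_origin, same_args, builtin_is_generic_alias, typing_is_generic_alias)
```

A library that compares on `get_origin()` and `get_args()` should see the two spellings as equivalent, while one that checks the annotation object's class may not.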
...and just one more quick question: what's the best (most correct) way to provide a singleton object instance using the scrapy-poet IoC infrastructure?
I see, great catch! I believe we can use the
I'm not quite sure how large of an undertaking would it be to completely move to the
💖 That'd be @kmike's work for you :)
Lots of approaches on this one, but I think the most convenient one is to assign it as a class variable in the provider itself. Technically, it's not a true singleton in this case since the |
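A minimal sketch of the class-variable idea, under some loud assumptions: `SharedArqProvider` and `_shared_arq` are hypothetical names, and the provider here is a plain class so the snippet runs on its own. In a real project it would subclass `PageObjectInputProvider` as in the example earlier in the thread, and scrapy-poet would construct it for you.

```python
import asyncio


class Arq:
    async def enqueue_task(self, task: dict) -> None:
        print("enqueueing", task)


class SharedArqProvider:
    # Stand-in for a PageObjectInputProvider subclass (assumption: the
    # real base class is omitted so this sketch has no dependencies).
    provided_classes = {Arq}
    _shared_arq = None  # class variable, shared by all provider instances

    async def __call__(self, to_provide):
        cls = type(self)
        if cls._shared_arq is None:
            # Create the instance lazily on first use, then reuse it.
            cls._shared_arq = Arq()
        return [cls._shared_arq]


async def demo():
    # Two separate provider instances hand out the same Arq object.
    first = await SharedArqProvider()(set())
    second = await SharedArqProvider()(set())
    return first[0], second[0]


a, b = asyncio.run(demo())
print(a is b)
```

The instance is shared per provider class rather than enforced globally, so nothing stops other code from constructing its own `Arq()`.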
Hi, just sample setup:
Injectable entity - arq: Arq. So, i'd like to work with arq instance here.
and i got the error like this:
So, could you pls explain why this error happens and how to fix it?