
Cannot override log related spider settings #10

Open
andrewbaxter opened this issue Feb 18, 2015 · 8 comments

Comments

@andrewbaxter

AFAICT it's not possible to override LOG_LEVEL, LOG_FILE, LOG_DIR, etc. for spiders, because the dict from get_scrapyrt_settings is applied with priority 'cmdline'.

I assume this is due to conflicting goals:

  1. Have scrapyrt be a "drop in" runner with no config changes required
  2. Have sane logging in the presence of multiple crawls

My take is:

  1. The dict should have priority 'default' (since they really are defaults - the spider developer might want to customize them)
  2. scrapyrt should use a scrapyrt.cfg file rather than scrapy.cfg

scrapy.cfg is typically small enough that requiring the user to either copy it or use a template from the documentation wouldn't be a significant burden.
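For context, the priority mechanics at issue can be sketched in pure Python. This is a minimal mimic of how Scrapy resolves setting priorities (mirroring the numeric order in scrapy.settings.SETTINGS_PRIORITIES), not the real Scrapy API:

```python
# Higher numeric priority wins; these values mirror Scrapy's
# SETTINGS_PRIORITIES mapping.
PRIORITIES = {'default': 0, 'command': 10, 'project': 20,
              'spider': 30, 'cmdline': 40}

class Setting:
    """Toy single-setting store that keeps the highest-priority value."""
    def __init__(self):
        self.value, self.priority = None, -1

    def set(self, value, priority):
        if PRIORITIES[priority] >= self.priority:
            self.value, self.priority = value, PRIORITIES[priority]

log_level = Setting()
log_level.set('INFO', 'project')     # value from the project's settings.py
log_level.set('DEBUG', 'cmdline')    # ScrapyRT's forced default
print(log_level.value)               # DEBUG - the project value loses
log_level.set('WARNING', 'project')  # too low a priority: ignored
print(log_level.value)               # still DEBUG
```

This illustrates the complaint: as long as ScrapyRT applies its dict at 'cmdline', nothing the spider developer puts in project settings can win; at 'default' the project settings would win instead.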

@chekunkov
Contributor

@andrewbaxter the idea behind using scrapy.cfg instead of a custom config file was to be able to run scrapyrt in any Scrapy project directory without making any changes - just run scrapyrt in the project directory and you're done.

Priority 'cmdline' is used here because ScrapyRT's default settings should have the highest possible priority and override any project settings - ScrapyRT relies on that. By contrast, 'default' is the lowest possible priority and will be overridden by project settings. I don't think Scrapy's priority='default' reasoning applies here - overriding one of Scrapy's own defaults can't cause harm, but here it can.

I think a better option would be to allow overriding the default ScrapyRT spider settings from CrawlManager. That way you would be able to remove or change any setting ScrapyRT is forcing.
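A hedged sketch of that workaround, using a stand-in class instead of the real scrapyrt.core.CrawlManager (the method name get_scrapyrt_settings and the dict it returns are taken from this thread; they are implementation details that may change between ScrapyRT versions):

```python
class CrawlManager:
    """Stand-in for ScrapyRT's CrawlManager, for illustration only."""
    def get_scrapyrt_settings(self):
        # Illustrative values: ScrapyRT forces per-crawl log settings here.
        return {'LOG_FILE': 'logs/crawl.log',
                'LOG_LEVEL': 'DEBUG',
                'TELNETCONSOLE_ENABLED': False}

class QuietCrawlManager(CrawlManager):
    """Drop the forced log settings so project settings win instead."""
    def get_scrapyrt_settings(self):
        settings = super().get_scrapyrt_settings()
        for key in ('LOG_FILE', 'LOG_LEVEL'):
            settings.pop(key, None)  # remove what ScrapyRT was forcing
        return settings

print(sorted(QuietCrawlManager().get_scrapyrt_settings()))
# ['TELNETCONSOLE_ENABLED']
```

In a real project the base class would be imported from scrapyrt and the subclass pointed to via ScrapyRT's configuration; as the thread notes below, relying on this hook couples you to internals.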

@chekunkov
Contributor

@andrewbaxter oh, I think I missed one more option that doesn't require any changes to ScrapyRT - just override CrawlManager and set any settings you want in the relevant method. That method returns Scrapy Settings, which you can easily update.

@andrewbaxter
Author

Overriding get_scrapyrt_settings and get_project_settings is just as dangerous as changing the priority to 'default' (or 'command' - I misspoke; that's what's actually being used), right?

Also, CrawlManager overrides seem to depend on implementation details - if there's a chance the implementation could change and silently break our code (e.g. by renaming one of those methods), it would be more reliable to maintain a local fork.

Anyway, we're getting by right now, but I would appreciate some sort of supported channel for making log settings changes.

@chekunkov
Contributor

@andrewbaxter I'm thinking about allowing Scrapy settings with the prefix SCRAPY_ in the ScrapyRT settings module. So, for instance, to change the default log level one could add the following lines to scrapyrt_conf.py:

# ...
SCRAPY_LOG_LEVEL = log.DEBUG

and pass this config file to scrapyrt command

scrapyrt -S scrapyrt_conf

WDYT?
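A minimal sketch of how such a SCRAPY_ prefix could be interpreted when loading the config module. This is the proposal under discussion, not shipped ScrapyRT behaviour, and the helper name is illustrative:

```python
PREFIX = 'SCRAPY_'

def extract_scrapy_overrides(config_vars):
    """Return {setting_name: value} for every SCRAPY_-prefixed entry.

    config_vars would typically be vars(settings_module) for the module
    passed via `scrapyrt -S scrapyrt_conf`; non-prefixed entries are
    left to ScrapyRT itself.
    """
    return {name[len(PREFIX):]: value
            for name, value in config_vars.items()
            if name.startswith(PREFIX)}

conf = {'SCRAPY_LOG_LEVEL': 'DEBUG', 'PORT': 9080}
print(extract_scrapy_overrides(conf))  # {'LOG_LEVEL': 'DEBUG'}
```

The extracted dict would then be applied to the crawl's Scrapy Settings at whatever priority the maintainers choose, which is exactly the open question in this thread.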

@pawelmhm
Member

May be related to #62.

@fcanobrash
Contributor

@pawelmhm, what do you think about @andrewbaxter's idea of changing the priority in get_scrapyrt_settings to 'default'?
This could solve many related issues, and at first glance I don't see how allowing those particular settings to be overridden could be harmful. I might be wrong; otherwise I can write a PR for it.

@internalG

Any plans for this? By default I find many log files in the logs directory; it seems ScrapyRT creates one file per request, which is somewhat unexpected. What's the best practice for configuring logging?


5 participants