Compare commits

...

51 Commits

Author SHA1 Message Date
dgtlmoon
52df602af7 Merge branch 'master' into store-watch-configs-in-own-dir 2024-09-28 15:38:05 +02:00
dgtlmoon
1b625dc18a UI - "Filters & Triggers" - Live preview of text filters (Preview the output of the filters section in realtime) (#2612) 2024-09-28 10:40:47 +02:00
dgtlmoon
367afc81e9 Reversing subprocess execution - saved a little memory but used a LOT more CPU (#2659) 2024-09-27 21:36:02 +02:00
dgtlmoon
ddfbef6db3 [test] Use local data instead of reaching out to changedetection when testing (#2660) 2024-09-27 20:30:19 +02:00
dgtlmoon
e173954cdd Restock monitor - Only try to process restock information (like scraping for "out of stock" keywords) if the page was actually rendered correctly. (#2645) 2024-09-20 09:19:57 +02:00
dgtlmoon
e830fb2320 Text filters - Adding filters "Trim whitespace" and "Remove duplicate lines" 2024-09-18 15:45:44 +02:00
dgtlmoon
c6589ee1b4 Browser Steps - UI - Use a better flexbox layout 2024-09-18 11:26:10 +02:00
Michael McMillan
dc936a2e8a Filters - Add support for also removing HTML elements using XPath selectors (#2632) 2024-09-17 22:43:04 +02:00
dgtlmoon
8c1527c1ad Update AppRise notification library to 1.9.0 (#2624) 2024-09-17 19:06:17 +02:00
Dawid Wróbel
a5ff1cd1d7 browser_steps: add "click element containing text if exists" (#2629) 2024-09-17 18:30:54 +02:00
dgtlmoon
543cb205d2 Testing - Fixing Restock test #2641 2024-09-17 18:29:12 +02:00
dgtlmoon
273adfa0a4 Testing - Fix false filter missing check alerts 2024-09-17 16:55:04 +02:00
Felipe Tuffani
8ecfd17973 Restock/Price detection - Fix duplicated prices with different data type on single page product #2636 (#2638) 2024-09-17 11:22:54 +02:00
dgtlmoon
19f3851c9d Memory management improvements - LXML and other libraries can leak allocation, wrap in a sub-process (#2626) 2024-09-11 16:20:49 +02:00
dgtlmoon
7f2fa20318 Small memory allocation fixes (#2625) 2024-09-11 14:51:32 +02:00
dgtlmoon
e16814e40b Testing - locale fix for test (#2623) 2024-09-11 11:31:07 +02:00
dgtlmoon
337fcab3f1 Testing/Code - Improving test reliability (#2617) 2024-09-09 16:50:00 +02:00
dgtlmoon
eaccd6026c UI - Hiding noisy info under 'show advanced help' button (#2609) 2024-09-06 14:33:06 +02:00
dgtlmoon
1a0ac5c952 Not used 2024-09-05 11:26:52 +02:00
dgtlmoon
78f7614a00 tidyup 2024-09-04 18:31:28 +02:00
dgtlmoon
c104a9f426 Store each watch config in its own directory, saves rewriting the whole json file each time 2024-09-04 18:20:04 +02:00
dgtlmoon
5b70625eaa 0.46.04 2024-09-04 13:55:18 +02:00
dgtlmoon
60d292107d Fixing restock monitor tests and tweaking docker default config example, 2024-09-02 15:11:31 +02:00
dgtlmoon
1cb38347da Container name should be 'sockpuppetbrowser' because its not just playwright that uses it 2024-09-02 13:21:38 +02:00
dgtlmoon
55fe2abf42 Restock/Price detection - Better catching of errors when parsing metadata documents for restock/price check (#2602) 2024-09-01 13:07:06 +02:00
dgtlmoon
4225900ec3 Restock - updating texts and text offsets 2024-09-01 12:47:21 +02:00
dgtlmoon
1fb4342488 Build - Unpin jsonschema for faster builds (#2583) 2024-08-22 15:02:00 +02:00
dgtlmoon
7071df061a Price detection/scraping - Adding extra element training data (#2582) 2024-08-22 15:01:36 +02:00
dgtlmoon
6dd1fa2b88 0.46.03 2024-08-19 17:22:13 +02:00
dgtlmoon
371f85d544 Watch 'Download last snapshot' link/button should give last, not first snapshot (#2576) 2024-08-19 17:20:30 +02:00
dgtlmoon
932cf15e1e Price and restock scraping - small price fix scraper (#2575) 2024-08-19 15:47:19 +02:00
Mike Splain
bf0d410d32 Browser Steps UI - Interactive UI wasn't sending headers but was when the check ran (#2551) 2024-08-19 10:21:05 +02:00
dgtlmoon
730f37c7ba Set encoding type for scraper script reader (#2574 #2568) 2024-08-19 09:17:18 +02:00
dgtlmoon
8a35d62e02 Handle zero-byte/empty content responses with "[ ] Empty pages are a change" option, the same as when the HTML doesnt render any useful text (#2530) 2024-07-29 13:27:59 +02:00
dgtlmoon
f527744024 0.46.02 2024-07-27 20:28:04 +02:00
dgtlmoon
71c9b1273c Adding test for #1995 UTF-8 encoding in POST request body and post:// notifications (#2525) 2024-07-27 19:47:03 +02:00
dgtlmoon
ec68450df1 Updating Apprise notification library , Splunk/VictorOps, Africas Talking, Microsoft Power Automate / Workflows, Société Française du Radiotéléphone (SFR) Support (#2524) 2024-07-27 14:28:57 +02:00
dgtlmoon
2fd762a783 Encode POST style requests and notifications as UTF-8 if it has no encoding/basic string (#2523) 2024-07-27 14:27:15 +02:00
Kenny Root
d7e85ffe8f VisualSelector+BrowserSteps - When scraping elements, check for null results (#2517) 2024-07-25 14:44:10 +02:00
Kenny Root
d23a301826 Use #!/usr/bin/env to support virtualenv (#2518) 2024-07-25 14:39:01 +02:00
dgtlmoon
3ce6096fdb Update README.md 2024-07-21 18:18:24 +02:00
dgtlmoon
8acdcdd861 UI - Adding "Download latest HTML snapshot" from Edit Watch > Stats page for easier debugging (#2513) 2024-07-21 18:17:34 +02:00
dgtlmoon
755cba33de 0.46.01 2024-07-19 13:53:00 +02:00
dgtlmoon
8aae7dfae0 UI - Fixing up 'test notification' bug from main settings and tag settings pages #2510 (#2511) 2024-07-19 13:51:15 +02:00
dgtlmoon
ed00f67a80 0.46.00 2024-07-18 14:13:06 +02:00
dgtlmoon
44e7e142f8 Restock/Price detection - Improving text information snapshot value 2024-07-18 13:28:54 +02:00
dgtlmoon
fe704e05a3 Restock - Tweaking storage of "original price" 2024-07-18 13:15:56 +02:00
dgtlmoon
e756e0af5e Fixing file:// file pickup - for change detection of local files (#2505) 2024-07-18 13:05:27 +02:00
dgtlmoon
c0b6c8581e Adding Apple M1 Pro type arm64/v8 support docker image (#2507) 2024-07-18 11:58:34 +02:00
dgtlmoon
de558f208f Dropping older ARM v6 support due to dependencies not having support (#2506) 2024-07-18 10:54:55 +02:00
dgtlmoon
321426dea2 Ability to use restock and price amounts in notifications as tokens (for example {{restock.price}} ) (#2503) 2024-07-17 20:27:47 +02:00
107 changed files with 1362 additions and 487 deletions

View File

@@ -95,7 +95,7 @@ jobs:
push: true
tags: |
${{ secrets.DOCKER_HUB_USERNAME }}/changedetection.io:dev,ghcr.io/${{ github.repository }}:dev
platforms: linux/amd64,linux/arm64,linux/arm/v6,linux/arm/v7,linux/arm/v8
platforms: linux/amd64,linux/arm64,linux/arm/v7,linux/arm/v8,linux/arm64/v8
cache-from: type=gha
cache-to: type=gha,mode=max
@@ -116,7 +116,7 @@ jobs:
ghcr.io/dgtlmoon/changedetection.io:${{ github.event.release.tag_name }}
${{ secrets.DOCKER_HUB_USERNAME }}/changedetection.io:latest
ghcr.io/dgtlmoon/changedetection.io:latest
platforms: linux/amd64,linux/arm64,linux/arm/v6,linux/arm/v7,linux/arm/v8
platforms: linux/amd64,linux/arm64,linux/arm/v7,linux/arm/v8,linux/arm64/v8
cache-from: type=gha
cache-to: type=gha,mode=max
# Looks like this was disabled

View File

@@ -64,7 +64,7 @@ jobs:
with:
context: ./
file: ./Dockerfile
platforms: linux/amd64,linux/arm64,linux/arm/v6,linux/arm/v7,linux/arm/v8
platforms: linux/amd64,linux/arm64,linux/arm/v7,linux/arm/v8,linux/arm64/v8
cache-from: type=local,src=/tmp/.buildx-cache
cache-to: type=local,dest=/tmp/.buildx-cache

View File

@@ -41,6 +41,20 @@ Using the **Browser Steps** configuration, add basic steps before performing cha
After **Browser Steps** have been run, then visit the **Visual Selector** tab to refine the content you're interested in.
Requires Playwright to be enabled.
### Awesome restock and price change notifications
Enable the _"Re-stock & Price detection for single product pages"_ option to activate the best way to monitor product pricing, this will extract any meta-data in the HTML page and give you many options to follow the pricing of the product.
Easily organise and monitor prices for products from the dashboard, get alerts and notifications when the price of a product changes or comes back in stock again!
[<img src="docs/restock-overview.png" style="max-width:100%;" alt="Easily keep an eye on product price changes directly from the UI" title="Easily keep an eye on product price changes directly from the UI" />](https://changedetection.io?src=github)
Set price change notification parameters, upper and lower price, price change percentage and more.
Always know when a product for sale drops in price.
[<img src="docs/restock-settings.png" style="max-width:100%;" alt="Set upper lower and percentage price change notification values" title="Set upper lower and percentage price change notification values" />](https://changedetection.io?src=github)
### Example use cases

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
# Only exists for direct CLI usage

View File

@@ -1,8 +1,8 @@
#!/usr/bin/python3
#!/usr/bin/env python3
# Read more https://github.com/dgtlmoon/changedetection.io/wiki
__version__ = '0.45.26'
__version__ = '0.46.04'
from changedetectionio.strtobool import strtobool
from json.decoder import JSONDecodeError
@@ -43,7 +43,6 @@ def main():
global app
datastore_path = None
do_cleanup = False
host = ''
ipv6_enabled = False
port = os.environ.get('PORT') or 5000
@@ -91,10 +90,6 @@ def main():
logger.success("Enabling IPv6 listen support")
ipv6_enabled = True
# Cleanup (remove text files that arent in the index)
if opt == '-c':
do_cleanup = True
# Create the datadir if it doesnt exist
if opt == '-C':
create_datastore_dir = True
@@ -145,10 +140,6 @@ def main():
signal.signal(signal.SIGTERM, sigshutdown_handler)
signal.signal(signal.SIGINT, sigshutdown_handler)
# Go into cleanup mode
if do_cleanup:
datastore.remove_unused_snapshots()
app.config['datastore_path'] = datastore_path

View File

@@ -0,0 +1,78 @@
# include the decorator
from apprise.decorators import notify
@notify(on="delete")
@notify(on="deletes")
@notify(on="get")
@notify(on="gets")
@notify(on="post")
@notify(on="posts")
@notify(on="put")
@notify(on="puts")
def apprise_custom_api_call_wrapper(body, title, notify_type, *args, **kwargs):
import requests
import json
from apprise.utils import parse_url as apprise_parse_url
from apprise import URLBase
url = kwargs['meta'].get('url')
if url.startswith('post'):
r = requests.post
elif url.startswith('get'):
r = requests.get
elif url.startswith('put'):
r = requests.put
elif url.startswith('delete'):
r = requests.delete
url = url.replace('post://', 'http://')
url = url.replace('posts://', 'https://')
url = url.replace('put://', 'http://')
url = url.replace('puts://', 'https://')
url = url.replace('get://', 'http://')
url = url.replace('gets://', 'https://')
url = url.replace('put://', 'http://')
url = url.replace('puts://', 'https://')
url = url.replace('delete://', 'http://')
url = url.replace('deletes://', 'https://')
headers = {}
params = {}
auth = None
# Convert /foobar?+some-header=hello to proper header dictionary
results = apprise_parse_url(url)
if results:
# Add our headers that the user can potentially over-ride if they wish
# to to our returned result set and tidy entries by unquoting them
headers = {URLBase.unquote(x): URLBase.unquote(y)
for x, y in results['qsd+'].items()}
# https://github.com/caronc/apprise/wiki/Notify_Custom_JSON#get-parameter-manipulation
# In Apprise, it relies on prefixing each request arg with "-", because it uses say &method=update as a flag for apprise
# but here we are making straight requests, so we need todo convert this against apprise's logic
for k, v in results['qsd'].items():
if not k.strip('+-') in results['qsd+'].keys():
params[URLBase.unquote(k)] = URLBase.unquote(v)
# Determine Authentication
auth = ''
if results.get('user') and results.get('password'):
auth = (URLBase.unquote(results.get('user')), URLBase.unquote(results.get('user')))
elif results.get('user'):
auth = (URLBase.unquote(results.get('user')))
# Try to auto-guess if it's JSON
try:
json.loads(body)
headers['Content-Type'] = 'application/json; charset=utf-8'
except ValueError as e:
pass
r(results.get('url'),
auth=auth,
data=body.encode('utf-8') if type(body) is str else body,
headers=headers,
params=params
)

View File

@@ -85,7 +85,8 @@ def construct_blueprint(datastore: ChangeDetectionStore):
browsersteps_start_session['browserstepper'] = browser_steps.browsersteps_live_ui(
playwright_browser=browsersteps_start_session['browser'],
proxy=proxy,
start_url=datastore.data['watching'][watch_uuid].get('url')
start_url=datastore.data['watching'][watch_uuid].get('url'),
headers=datastore.data['watching'][watch_uuid].get('headers')
)
# For test

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import os
import time
@@ -25,6 +25,7 @@ browser_step_ui_config = {'Choose one': '0 0',
'Click element if exists': '1 0',
'Click element': '1 0',
'Click element containing text': '0 1',
'Click element containing text if exists': '0 1',
'Enter text in field': '1 1',
'Execute JS': '0 1',
# 'Extract text and use as filter': '1 0',
@@ -96,12 +97,24 @@ class steppable_browser_interface():
return self.action_goto_url(value=self.start_url)
def action_click_element_containing_text(self, selector=None, value=''):
logger.debug("Clicking element containing text")
if not len(value.strip()):
return
elem = self.page.get_by_text(value)
if elem.count():
elem.first.click(delay=randint(200, 500), timeout=3000)
def action_click_element_containing_text_if_exists(self, selector=None, value=''):
logger.debug("Clicking element containing text if exists")
if not len(value.strip()):
return
elem = self.page.get_by_text(value)
logger.debug(f"Clicking element containing text - {elem.count()} elements found")
if elem.count():
elem.first.click(delay=randint(200, 500), timeout=3000)
else:
return
def action_enter_text_in_field(self, selector, value):
if not len(selector.strip()):
return

View File

@@ -106,12 +106,14 @@ def construct_blueprint(datastore: ChangeDetectionStore):
form = group_restock_settings_form(formdata=request.form if request.method == 'POST' else None,
data=default,
extra_notification_tokens=datastore.get_unique_notification_tokens_available()
)
template_args = {
'data': default,
'form': form,
'watch': default
'watch': default,
'extra_notification_token_placeholder_info': datastore.get_unique_notification_token_placeholders_available(),
}
included_content = {}
@@ -161,6 +163,7 @@ def construct_blueprint(datastore: ChangeDetectionStore):
form = group_restock_settings_form(formdata=request.form if request.method == 'POST' else None,
data=default,
extra_notification_tokens=datastore.get_unique_notification_tokens_available()
)
# @todo subclass form so validation works
#if not form.validate():

View File

@@ -58,9 +58,9 @@ xpath://body/div/span[contains(@class, 'example-class')]",
{% if '/text()' in field %}
<span class="pure-form-message-inline"><strong>Note!: //text() function does not work where the &lt;element&gt; contains &lt;![CDATA[]]&gt;</strong></span><br>
{% endif %}
<span class="pure-form-message-inline">One rule per line, <i>any</i> rules that matches will be used.<br>
<ul>
<span class="pure-form-message-inline">One CSS, xPath, JSON Path/JQ selector per line, <i>any</i> rules that matches will be used.<br>
<div data-target="#advanced-help-selectors" class="toggle-show pure-button button-tag button-xsmall">Show advanced help and tips</div>
<ul id="advanced-help-selectors">
<li>CSS - Limit text to this CSS rule, only text matching this CSS rule is included.</li>
<li>JSON - Limit text to this JSON rule, using either <a href="https://pypi.org/project/jsonpath-ng/" target="new">JSONPath</a> or <a href="https://stedolan.github.io/jq/" target="new">jq</a> (if installed).
<ul>
@@ -89,11 +89,13 @@ xpath://body/div/span[contains(@class, 'example-class')]",
{{ render_field(form.subtractive_selectors, rows=5, placeholder="header
footer
nav
.stockticker") }}
.stockticker
//*[contains(text(), 'Advertisement')]") }}
<span class="pure-form-message-inline">
<ul>
<li> Remove HTML element(s) by CSS selector before text conversion. </li>
<li> Add multiple elements or CSS selectors per line to ignore multiple parts of the HTML. </li>
<li> Remove HTML element(s) by CSS and XPath selectors before text conversion. </li>
<li> Don't paste HTML here, use only CSS and XPath selectors </li>
<li> Add multiple elements, CSS or XPath selectors per line to ignore multiple parts of the HTML. </li>
</ul>
</span>
</fieldset>
@@ -128,7 +130,7 @@ nav
{% endif %}
<a href="#notifications" id="notification-setting-reset-to-default" class="pure-button button-xsmall" style="right: 20px; top: 20px; position: absolute; background-color: #5f42dd; border-radius: 4px; font-size: 70%; color: #fff">Use system defaults</a>
{{ render_common_settings_form(form, emailprefix, settings_application) }}
{{ render_common_settings_form(form, emailprefix, settings_application, extra_notification_token_placeholder_info) }}
</div>
</fieldset>
</div>

View File

@@ -65,8 +65,8 @@ class Fetcher():
def __init__(self):
import importlib.resources
self.xpath_element_js = importlib.resources.files("changedetectionio.content_fetchers.res").joinpath('xpath_element_scraper.js').read_text()
self.instock_data_js = importlib.resources.files("changedetectionio.content_fetchers.res").joinpath('stock-not-in-stock.js').read_text()
self.xpath_element_js = importlib.resources.files("changedetectionio.content_fetchers.res").joinpath('xpath_element_scraper.js').read_text(encoding='utf-8')
self.instock_data_js = importlib.resources.files("changedetectionio.content_fetchers.res").joinpath('stock-not-in-stock.js').read_text(encoding='utf-8')
@abstractmethod
def get_error(self):
@@ -81,7 +81,8 @@ class Fetcher():
request_method,
ignore_status_codes=False,
current_include_filters=None,
is_binary=False):
is_binary=False,
empty_pages_are_a_change=False):
# Should set self.error, self.status_code and self.content
pass

View File

@@ -83,7 +83,8 @@ class fetcher(Fetcher):
request_method,
ignore_status_codes=False,
current_include_filters=None,
is_binary=False):
is_binary=False,
empty_pages_are_a_change=False):
from playwright.sync_api import sync_playwright
import playwright._impl._errors
@@ -130,7 +131,7 @@ class fetcher(Fetcher):
if response is None:
context.close()
browser.close()
logger.debug("Content Fetcher > Response object was none")
logger.debug("Content Fetcher > Response object from the browser communication was none")
raise EmptyReply(url=url, status_code=None)
try:
@@ -166,10 +167,10 @@ class fetcher(Fetcher):
raise Non200ErrorCodeReceived(url=url, status_code=self.status_code, screenshot=screenshot)
if len(self.page.content().strip()) == 0:
if not empty_pages_are_a_change and len(self.page.content().strip()) == 0:
logger.debug("Content Fetcher > Content was empty, empty_pages_are_a_change = False")
context.close()
browser.close()
logger.debug("Content Fetcher > Content was empty")
raise EmptyReply(url=url, status_code=response.status)
# Run Browser Steps here

View File

@@ -75,7 +75,8 @@ class fetcher(Fetcher):
request_method,
ignore_status_codes,
current_include_filters,
is_binary
is_binary,
empty_pages_are_a_change
):
from changedetectionio.content_fetchers import visualselector_xpath_selectors
@@ -153,7 +154,7 @@ class fetcher(Fetcher):
if response is None:
await self.page.close()
await browser.close()
logger.warning("Content Fetcher > Response object was none")
logger.warning("Content Fetcher > Response object was none (as in, the response from the browser was empty, not just the content)")
raise EmptyReply(url=url, status_code=None)
self.headers = response.headers
@@ -186,10 +187,11 @@ class fetcher(Fetcher):
raise Non200ErrorCodeReceived(url=url, status_code=self.status_code, screenshot=screenshot)
content = await self.page.content
if len(content.strip()) == 0:
if not empty_pages_are_a_change and len(content.strip()) == 0:
logger.error("Content Fetcher > Content was empty (empty_pages_are_a_change is False), closing browsers")
await self.page.close()
await browser.close()
logger.error("Content Fetcher > Content was empty")
raise EmptyReply(url=url, status_code=response.status)
# Run Browser Steps here
@@ -247,7 +249,7 @@ class fetcher(Fetcher):
await self.fetch_page(**kwargs)
def run(self, url, timeout, request_headers, request_body, request_method, ignore_status_codes=False,
current_include_filters=None, is_binary=False):
current_include_filters=None, is_binary=False, empty_pages_are_a_change=False):
#@todo make update_worker async which could run any of these content_fetchers within memory and time constraints
max_time = os.getenv('PUPPETEER_MAX_PROCESSING_TIMEOUT_SECONDS', 180)
@@ -262,7 +264,8 @@ class fetcher(Fetcher):
request_method=request_method,
ignore_status_codes=ignore_status_codes,
current_include_filters=current_include_filters,
is_binary=is_binary
is_binary=is_binary,
empty_pages_are_a_change=empty_pages_are_a_change
), timeout=max_time))
except asyncio.TimeoutError:
raise(BrowserFetchTimedOut(msg=f"Browser connected but was unable to process the page in {max_time} seconds."))

View File

@@ -1,9 +1,7 @@
from loguru import logger
import hashlib
import os
import chardet
import requests
from changedetectionio import strtobool
from changedetectionio.content_fetchers.exceptions import BrowserStepsInUnsupportedFetcher, EmptyReply, Non200ErrorCodeReceived
from changedetectionio.content_fetchers.base import Fetcher
@@ -25,7 +23,11 @@ class fetcher(Fetcher):
request_method,
ignore_status_codes=False,
current_include_filters=None,
is_binary=False):
is_binary=False,
empty_pages_are_a_change=False):
import chardet
import requests
if self.browser_steps_get_valid_steps():
raise BrowserStepsInUnsupportedFetcher(url=url)
@@ -45,13 +47,19 @@ class fetcher(Fetcher):
if self.system_https_proxy:
proxies['https'] = self.system_https_proxy
r = requests.request(method=request_method,
data=request_body,
url=url,
headers=request_headers,
timeout=timeout,
proxies=proxies,
verify=False)
session = requests.Session()
if strtobool(os.getenv('ALLOW_FILE_URI', 'false')) and url.startswith('file://'):
from requests_file import FileAdapter
session.mount('file://', FileAdapter())
r = session.request(method=request_method,
data=request_body.encode('utf-8') if type(request_body) is str else request_body,
url=url,
headers=request_headers,
timeout=timeout,
proxies=proxies,
verify=False)
# If the response did not tell us what encoding format to expect, Then use chardet to override what `requests` thinks.
# For example - some sites don't tell us it's utf-8, but return utf-8 content
@@ -67,7 +75,10 @@ class fetcher(Fetcher):
self.headers = r.headers
if not r.content or not len(r.content):
raise EmptyReply(url=url, status_code=r.status_code)
if not empty_pages_are_a_change:
raise EmptyReply(url=url, status_code=r.status_code)
else:
logger.debug(f"URL {url} gave zero byte content reply with Status Code {r.status_code}, but empty_pages_are_a_change = True")
# @todo test this
# @todo maybe you really want to test zero-byte return pages?

View File

@@ -75,6 +75,7 @@ function isItemInStock() {
'vergriffen',
'vorbestellen',
'vorbestellung ist bald möglich',
'we don\'t currently have any',
'we couldn\'t find any products that match',
'we do not currently have an estimate of when this product will be back in stock.',
'we don\'t know when or if this item will be back in stock.',
@@ -173,7 +174,8 @@ function isItemInStock() {
const element = elementsToScan[i];
// outside the 'fold' or some weird text in the heading area
// .getBoundingClientRect() was causing a crash in chrome 119, can only be run on contentVisibility != hidden
if (element.getBoundingClientRect().top + window.scrollY >= vh + 150 || element.getBoundingClientRect().top + window.scrollY <= 100) {
// Note: theres also an automated test that places the 'out of stock' text fairly low down
if (element.getBoundingClientRect().top + window.scrollY >= vh + 250 || element.getBoundingClientRect().top + window.scrollY <= 100) {
continue
}
elementText = "";
@@ -187,7 +189,7 @@ function isItemInStock() {
// and these mean its out of stock
for (const outOfStockText of outOfStockTexts) {
if (elementText.includes(outOfStockText)) {
console.log(`Selected 'Out of Stock' - found text "${outOfStockText}" - "${elementText}"`)
console.log(`Selected 'Out of Stock' - found text "${outOfStockText}" - "${elementText}" - offset top ${element.getBoundingClientRect().top}, page height is ${vh}`)
return outOfStockText; // item is out of stock
}
}

View File

@@ -164,6 +164,15 @@ visibleElementsArray.forEach(function (element) {
}
}
let label = "not-interesting" // A placeholder, the actual labels for training are done by hand for now
let text = element.textContent.trim().slice(0, 30).trim();
while (/\n{2,}|\t{2,}/.test(text)) {
text = text.replace(/\n{2,}/g, '\n').replace(/\t{2,}/g, '\t')
}
// Try to identify any possible currency amounts "Sale: 4000" or "Sale now 3000 Kc", can help with the training.
const hasDigitCurrency = (/\d/.test(text.slice(0, 6)) || /\d/.test(text.slice(-6)) ) && /([€£$¥₩₹]|USD|AUD|EUR|Kč|kr|SEK|,)/.test(text) ;
size_pos.push({
xpath: xpath_result,
@@ -171,9 +180,16 @@ visibleElementsArray.forEach(function (element) {
height: Math.round(bbox['height']),
left: Math.floor(bbox['left']),
top: Math.floor(bbox['top']) + scroll_y,
// tagName used by Browser Steps
tagName: (element.tagName) ? element.tagName.toLowerCase() : '',
// tagtype used by Browser Steps
tagtype: (element.tagName.toLowerCase() === 'input' && element.type) ? element.type.toLowerCase() : '',
isClickable: window.getComputedStyle(element).cursor == "pointer"
isClickable: window.getComputedStyle(element).cursor === "pointer",
// Used by the keras trainer
fontSize: window.getComputedStyle(element).getPropertyValue('font-size'),
fontWeight: window.getComputedStyle(element).getPropertyValue('font-weight'),
hasDigitCurrency: hasDigitCurrency,
label: label,
});
});
@@ -214,7 +230,7 @@ if (include_filters.length) {
console.log(e);
}
if (results.length) {
if (results != null && results.length) {
// Iterate over the results
results.forEach(node => {

View File

@@ -56,7 +56,8 @@ class fetcher(Fetcher):
request_method,
ignore_status_codes=False,
current_include_filters=None,
is_binary=False):
is_binary=False,
empty_pages_are_a_change=False):
from selenium import webdriver
from selenium.webdriver.chrome.options import Options as ChromeOptions

View File

@@ -1,6 +1,8 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import datetime
import importlib
import flask_login
import locale
import os
@@ -10,7 +12,9 @@ import threading
import time
import timeago
from .content_fetchers.exceptions import ReplyWithContentButNoText
from .processors import find_processors, get_parent_module, get_custom_watch_obj_for_processor
from .processors.text_json_diff.processor import FilterNotFoundInResponse
from .safe_jinja import render as jinja_render
from changedetectionio.strtobool import strtobool
from copy import deepcopy
@@ -532,12 +536,22 @@ def changedetection_app(config=None, datastore_o=None):
@login_optionally_required
def ajax_callback_send_notification_test(watch_uuid=None):
# Watch_uuid could be unsuet in the case its used in tag editor, global setings
# Watch_uuid could be unset in the case its used in tag editor, global setings
import apprise
import random
from .apprise_asset import asset
apobj = apprise.Apprise(asset=asset)
# so that the custom endpoints are registered
from changedetectionio.apprise_plugin import apprise_custom_api_call_wrapper
is_global_settings_form = request.args.get('mode', '') == 'global-settings'
is_group_settings_form = request.args.get('mode', '') == 'group-settings'
watch = datastore.data['watching'].get(watch_uuid) if watch_uuid else None
# Use an existing random one on the global/main settings form
if not watch_uuid and (is_global_settings_form or is_group_settings_form):
logger.debug(f"Send test notification - Choosing random Watch {watch_uuid}")
watch_uuid = random.choice(list(datastore.data['watching'].keys()))
watch = datastore.data['watching'].get(watch_uuid)
notification_urls = request.form['notification_urls'].strip().splitlines()
@@ -549,8 +563,6 @@ def changedetection_app(config=None, datastore_o=None):
tag = datastore.tag_exists_by_name(k.strip())
notification_urls = tag.get('notifications_urls') if tag and tag.get('notifications_urls') else None
is_global_settings_form = request.args.get('mode', '') == 'global-settings'
is_group_settings_form = request.args.get('mode', '') == 'group-settings'
if not notification_urls and not is_global_settings_form and not is_group_settings_form:
# In the global settings, use only what is typed currently in the text box
logger.debug("Test notification - Trying by global system settings notifications")
@@ -569,7 +581,7 @@ def changedetection_app(config=None, datastore_o=None):
try:
# use the same as when it is triggered, but then override it with the form test values
n_object = {
'watch_url': request.form['window_url'],
'watch_url': request.form.get('window_url', "https://changedetection.io"),
'notification_urls': notification_urls
}
@@ -696,8 +708,9 @@ def changedetection_app(config=None, datastore_o=None):
form_class = forms.processor_text_json_diff_form
form = form_class(formdata=request.form if request.method == 'POST' else None,
data=default
)
data=default,
extra_notification_tokens=default.extra_notification_token_values()
)
# For the form widget tag UUID back to "string name" for the field
form.tags.datastore = datastore
@@ -824,6 +837,7 @@ def changedetection_app(config=None, datastore_o=None):
'emailprefix': os.getenv('NOTIFICATION_MAIL_BUTTON_PREFIX', False),
'extra_title': f" - Edit - {watch.label}",
'extra_processor_config': form.extra_tab_content(),
'extra_notification_token_placeholder_info': datastore.get_unique_notification_token_placeholders_available(),
'form': form,
'has_default_notification_urls': True if len(datastore.data['settings']['application']['notification_urls']) else False,
'has_extra_headers_file': len(datastore.get_all_headers_in_textfile_for_watch(uuid=uuid)) > 0,
@@ -878,7 +892,8 @@ def changedetection_app(config=None, datastore_o=None):
# Don't use form.data on POST so that it doesnt overrid the checkbox status from the POST status
form = forms.globalSettingsForm(formdata=request.form if request.method == 'POST' else None,
data=default
data=default,
extra_notification_tokens=datastore.get_unique_notification_tokens_available()
)
# Remove the last option 'System default'
@@ -930,6 +945,7 @@ def changedetection_app(config=None, datastore_o=None):
output = render_template("settings.html",
api_key=datastore.data['settings']['application'].get('api_access_token'),
emailprefix=os.getenv('NOTIFICATION_MAIL_BUTTON_PREFIX', False),
extra_notification_token_placeholder_info=datastore.get_unique_notification_token_placeholders_available(),
form=form,
hide_remove_pass=os.getenv("SALTED_PASS", False),
min_system_recheck_seconds=int(os.getenv('MINIMUM_SECONDS_RECHECK_TIME', 3)),
@@ -1358,6 +1374,83 @@ def changedetection_app(config=None, datastore_o=None):
except FileNotFoundError:
abort(404)
@app.route("/edit/<string:uuid>/get-html", methods=['GET'])
@login_optionally_required
def watch_get_latest_html(uuid):
from io import BytesIO
from flask import send_file
import brotli
watch = datastore.data['watching'].get(uuid)
if watch and watch.history.keys() and os.path.isdir(watch.watch_data_dir):
latest_filename = list(watch.history.keys())[-1]
html_fname = os.path.join(watch.watch_data_dir, f"{latest_filename}.html.br")
with open(html_fname, 'rb') as f:
if html_fname.endswith('.br'):
# Read and decompress the Brotli file
decompressed_data = brotli.decompress(f.read())
else:
decompressed_data = f.read()
buffer = BytesIO(decompressed_data)
return send_file(buffer, as_attachment=True, download_name=f"{latest_filename}.html", mimetype='text/html')
# Return a 500 error
abort(500)
@app.route("/edit/<string:uuid>/preview-rendered", methods=['POST'])
@login_optionally_required
def watch_get_preview_rendered(uuid):
'''For when viewing the "preview" of the rendered text from inside of Edit'''
now = time.time()
import brotli
from . import forms
text_after_filter = ''
tmp_watch = deepcopy(datastore.data['watching'].get(uuid))
if tmp_watch and tmp_watch.history and os.path.isdir(tmp_watch.watch_data_dir):
# Splice in the temporary stuff from the form
form = forms.processor_text_json_diff_form(formdata=request.form if request.method == 'POST' else None,
data=request.form
)
# Only update vars that came in via the AJAX post
p = {k: v for k, v in form.data.items() if k in request.form.keys()}
tmp_watch.update(p)
latest_filename = next(reversed(tmp_watch.history))
html_fname = os.path.join(tmp_watch.watch_data_dir, f"{latest_filename}.html.br")
with open(html_fname, 'rb') as f:
decompressed_data = brotli.decompress(f.read()).decode('utf-8') if html_fname.endswith('.br') else f.read().decode('utf-8')
# Just like a normal change detection except provide a fake "watch" object and dont call .call_browser()
processor_module = importlib.import_module("changedetectionio.processors.text_json_diff.processor")
update_handler = processor_module.perform_site_check(datastore=datastore,
watch_uuid=uuid # probably not needed anymore anyway?
)
# Use the last loaded HTML as the input
update_handler.fetcher.content = decompressed_data
try:
changed_detected, update_obj, contents, text_after_filter = update_handler.run_changedetection(
watch=tmp_watch,
skip_when_checksum_same=False,
)
except FilterNotFoundInResponse as e:
text_after_filter = f"Filter not found in HTML: {str(e)}"
except ReplyWithContentButNoText as e:
text_after_filter = f"Filter found but no text (empty result)"
except Exception as e:
text_after_filter = f"Error: {str(e)}"
if not text_after_filter.strip():
text_after_filter = 'Empty content'
logger.trace(f"Parsed in {time.time()-now:.3f}s")
return text_after_filter.strip()
@app.route("/form/add/quickwatch", methods=['POST'])
@login_optionally_required
def form_quick_watch_add():

View File

@@ -221,7 +221,8 @@ class ValidateAppRiseServers(object):
def __call__(self, form, field):
import apprise
apobj = apprise.Apprise()
# so that the custom endpoints are registered
from changedetectionio.apprise_plugin import apprise_custom_api_call_wrapper
for server_url in field.data:
if not apobj.add(server_url):
message = field.gettext('\'%s\' is not a valid AppRise URL.' % (server_url))
@@ -231,9 +232,6 @@ class ValidateJinja2Template(object):
"""
Validates that a {token} is from a valid set
"""
def __init__(self, message=None):
self.message = message
def __call__(self, form, field):
from changedetectionio import notification
@@ -248,6 +246,10 @@ class ValidateJinja2Template(object):
try:
jinja2_env = ImmutableSandboxedEnvironment(loader=BaseLoader)
jinja2_env.globals.update(notification.valid_tokens)
# Extra validation tokens provided on the form_class(... extra_tokens={}) setup
if hasattr(field, 'extra_notification_tokens'):
jinja2_env.globals.update(field.extra_notification_tokens)
jinja2_env.from_string(joined_data).render()
except TemplateSyntaxError as e:
raise ValidationError(f"This is not a valid Jinja2 template: {e}") from e
@@ -422,6 +424,12 @@ class quickWatchForm(Form):
class commonSettingsForm(Form):
from . import processors
def __init__(self, formdata=None, obj=None, prefix="", data=None, meta=None, **kwargs):
super().__init__(formdata, obj, prefix, data, meta, **kwargs)
self.notification_body.extra_notification_tokens = kwargs.get('extra_notification_tokens', {})
self.notification_title.extra_notification_tokens = kwargs.get('extra_notification_tokens', {})
self.notification_urls.extra_notification_tokens = kwargs.get('extra_notification_tokens', {})
extract_title_as_title = BooleanField('Extract <title> from document and use as watch title', default=False)
fetch_backend = RadioField(u'Fetch Method', choices=content_fetchers.available_fetchers(), validators=[ValidateContentFetcherIsReady()])
notification_body = TextAreaField('Notification Body', default='{{ watch_url }} had a change.', validators=[validators.Optional(), ValidateJinja2Template()])
@@ -429,8 +437,8 @@ class commonSettingsForm(Form):
notification_title = StringField('Notification Title', default='ChangeDetection.io Notification - {{ watch_url }}', validators=[validators.Optional(), ValidateJinja2Template()])
notification_urls = StringListField('Notification URL List', validators=[validators.Optional(), ValidateAppRiseServers(), ValidateJinja2Template()])
processor = RadioField( label=u"Processor - What do you want to achieve?", choices=processors.available_processors(), default="text_json_diff")
webdriver_delay = IntegerField('Wait seconds before extracting text', validators=[validators.Optional(), validators.NumberRange(min=1,
message="Should contain one or more seconds")])
webdriver_delay = IntegerField('Wait seconds before extracting text', validators=[validators.Optional(), validators.NumberRange(min=1, message="Should contain one or more seconds")])
class importForm(Form):
from . import processors
@@ -461,7 +469,7 @@ class processor_text_json_diff_form(commonSettingsForm):
include_filters = StringListField('CSS/JSONPath/JQ/XPath Filters', [ValidateCSSJSONXPATHInput()], default='')
subtractive_selectors = StringListField('Remove elements', [ValidateCSSJSONXPATHInput(allow_xpath=False, allow_json=False)])
subtractive_selectors = StringListField('Remove elements', [ValidateCSSJSONXPATHInput(allow_json=False)])
extract_text = StringListField('Extract text', [ValidateListRegex()])
@@ -472,8 +480,10 @@ class processor_text_json_diff_form(commonSettingsForm):
body = TextAreaField('Request body', [validators.Optional()])
method = SelectField('Request method', choices=valid_method, default=default_method)
ignore_status_codes = BooleanField('Ignore status codes (process non-2xx status codes as normal)', default=False)
check_unique_lines = BooleanField('Only trigger when unique lines appear', default=False)
check_unique_lines = BooleanField('Only trigger when unique lines appear in all history', default=False)
remove_duplicate_lines = BooleanField('Remove duplicate lines of text', default=False)
sort_text_alphabetically = BooleanField('Sort text alphabetically', default=False)
trim_text_whitespace = BooleanField('Trim whitespace before and after text', default=False)
filter_text_added = BooleanField('Added lines', default=True)
filter_text_replaced = BooleanField('Replaced/changed lines', default=True)
@@ -568,7 +578,7 @@ class globalSettingsApplicationForm(commonSettingsForm):
empty_pages_are_a_change = BooleanField('Treat empty pages as a change?', default=False)
fetch_backend = RadioField('Fetch Method', default="html_requests", choices=content_fetchers.available_fetchers(), validators=[ValidateContentFetcherIsReady()])
global_ignore_text = StringListField('Ignore Text', [ValidateListRegex()])
global_subtractive_selectors = StringListField('Remove elements', [ValidateCSSJSONXPATHInput(allow_xpath=False, allow_json=False)])
global_subtractive_selectors = StringListField('Remove elements', [ValidateCSSJSONXPATHInput(allow_json=False)])
ignore_whitespace = BooleanField('Ignore whitespace')
password = SaltyPasswordField()
pager_size = IntegerField('Pager size',
@@ -590,6 +600,11 @@ class globalSettingsForm(Form):
# Define these as FormFields/"sub forms", this way it matches the JSON storage
# datastore.data['settings']['application']..
# datastore.data['settings']['requests']..
def __init__(self, formdata=None, obj=None, prefix="", data=None, meta=None, **kwargs):
super().__init__(formdata, obj, prefix, data, meta, **kwargs)
self.application.notification_body.extra_notification_tokens = kwargs.get('extra_notification_tokens', {})
self.application.notification_title.extra_notification_tokens = kwargs.get('extra_notification_tokens', {})
self.application.notification_urls.extra_notification_tokens = kwargs.get('extra_notification_tokens', {})
requests = FormField(globalSettingsRequestForm)
application = FormField(globalSettingsApplicationForm)

View File

@@ -1,10 +1,5 @@
from bs4 import BeautifulSoup
from inscriptis import get_text
from jsonpath_ng.ext import parse
from typing import List
from inscriptis.model.config import ParserConfig
from xml.sax.saxutils import escape as xml_escape
from lxml import etree
import json
import re
@@ -39,6 +34,7 @@ def perl_style_slash_enclosed_regex_to_options(regex):
# Given a CSS Rule, and a blob of HTML, return the blob of HTML that matches
def include_filters(include_filters, html_content, append_pretty_line_formatting=False):
from bs4 import BeautifulSoup
soup = BeautifulSoup(html_content, "html.parser")
html_block = ""
r = soup.select(include_filters, separator="")
@@ -56,16 +52,32 @@ def include_filters(include_filters, html_content, append_pretty_line_formatting
return html_block
def subtractive_css_selector(css_selector, html_content):
from bs4 import BeautifulSoup
soup = BeautifulSoup(html_content, "html.parser")
for item in soup.select(css_selector):
item.decompose()
return str(soup)
def subtractive_xpath_selector(xpath_selector, html_content):
html_tree = etree.HTML(html_content)
elements_to_remove = html_tree.xpath(xpath_selector)
for element in elements_to_remove:
element.getparent().remove(element)
modified_html = etree.tostring(html_tree, method="html").decode("utf-8")
return modified_html
def element_removal(selectors: List[str], html_content):
"""Joins individual filters into one css filter."""
selector = ",".join(selectors)
return subtractive_css_selector(selector, html_content)
"""Removes elements that match a list of CSS or xPath selectors."""
modified_html = html_content
for selector in selectors:
if selector.startswith(('xpath:', 'xpath1:', '//')):
xpath_selector = selector.removeprefix('xpath:').removeprefix('xpath1:')
modified_html = subtractive_xpath_selector(xpath_selector, modified_html)
else:
modified_html = subtractive_css_selector(selector, modified_html)
return modified_html
def elementpath_tostring(obj):
"""
@@ -181,6 +193,7 @@ def xpath1_filter(xpath_filter, html_content, append_pretty_line_formatting=Fals
# Extract/find element
def extract_element(find='title', html_content=''):
from bs4 import BeautifulSoup
#Re #106, be sure to handle when its not found
element_text = None
@@ -194,6 +207,8 @@ def extract_element(find='title', html_content=''):
#
def _parse_json(json_data, json_filter):
from jsonpath_ng.ext import parse
if json_filter.startswith("json:"):
jsonpath_expression = parse(json_filter.replace('json:', ''))
match = jsonpath_expression.find(json_data)
@@ -242,6 +257,8 @@ def _get_stripped_text_from_json_match(match):
# json_filter - ie json:$..price
# ensure_is_ldjson_info_type - str "product", optional, "@type == product" (I dont know how to do that as a json selector)
def extract_json_as_string(content, json_filter, ensure_is_ldjson_info_type=None):
from bs4 import BeautifulSoup
stripped_text_from_html = False
# https://github.com/dgtlmoon/changedetection.io/pull/2041#issuecomment-1848397161w
# Try to parse/filter out the JSON, if we get some parser error, then maybe it's embedded within HTML tags
@@ -352,6 +369,7 @@ def strip_ignore_text(content, wordlist, mode="content"):
return "\n".encode('utf8').join(output)
def cdata_in_document_to_text(html_content: str, render_anchor_tag_content=False) -> str:
from xml.sax.saxutils import escape as xml_escape
pattern = '<!\[CDATA\[(\s*(?:.(?<!\]\]>)\s*)*)\]\]>'
def repl(m):
text = m.group(1)
@@ -360,6 +378,9 @@ def cdata_in_document_to_text(html_content: str, render_anchor_tag_content=False
return re.sub(pattern, repl, html_content)
def html_to_text(html_content: str, render_anchor_tag_content=False, is_rss=False) -> str:
from inscriptis import get_text
from inscriptis.model.config import ParserConfig
"""Converts html string to a string with just the text. If ignoring
rendering anchor tag content is enable, anchor tag content are also
included in the text

View File

@@ -91,8 +91,10 @@ class model(watch_base):
def clear_watch(self):
import pathlib
# JSON Data, Screenshots, Textfiles (history index and snapshots), HTML in the future etc
# Screenshots, Textfiles (history index and snapshots), HTML in the future etc
for item in pathlib.Path(str(self.watch_data_dir)).rglob("*.*"):
if str(item).endswith('.json'):
continue
os.unlink(item)
# Force the attr to recalculate
@@ -432,6 +434,17 @@ class model(watch_base):
def toggle_mute(self):
self['notification_muted'] ^= True
def extra_notification_token_values(self):
# Used for providing extra tokens
# return {'widget': 555}
return {}
def extra_notification_token_placeholder_info(self):
# Used for providing extra tokens
# return [('widget', "Get widget amounts")]
return []
def extract_regex_from_all_history(self, regex):
import csv
import re

View File

@@ -60,6 +60,8 @@ class watch_base(dict):
'time_between_check_use_default': True,
'title': None,
'track_ldjson_price_data': None,
'trim_text_whitespace': False,
'remove_duplicate_lines': False,
'trigger_text': [], # List of text or regex to wait for until a change is detected
'url': '',
'uuid': str(uuid.uuid4()),

View File

@@ -1,9 +1,10 @@
import apprise
import time
from apprise import NotifyFormat
import json
import apprise
from loguru import logger
valid_tokens = {
'base_url': '',
'current_snapshot': '',
@@ -34,86 +35,11 @@ valid_notification_formats = {
default_notification_format_for_watch: default_notification_format_for_watch
}
# include the decorator
from apprise.decorators import notify
@notify(on="delete")
@notify(on="deletes")
@notify(on="get")
@notify(on="gets")
@notify(on="post")
@notify(on="posts")
@notify(on="put")
@notify(on="puts")
def apprise_custom_api_call_wrapper(body, title, notify_type, *args, **kwargs):
import requests
from apprise.utils import parse_url as apprise_parse_url
from apprise import URLBase
url = kwargs['meta'].get('url')
if url.startswith('post'):
r = requests.post
elif url.startswith('get'):
r = requests.get
elif url.startswith('put'):
r = requests.put
elif url.startswith('delete'):
r = requests.delete
url = url.replace('post://', 'http://')
url = url.replace('posts://', 'https://')
url = url.replace('put://', 'http://')
url = url.replace('puts://', 'https://')
url = url.replace('get://', 'http://')
url = url.replace('gets://', 'https://')
url = url.replace('put://', 'http://')
url = url.replace('puts://', 'https://')
url = url.replace('delete://', 'http://')
url = url.replace('deletes://', 'https://')
headers = {}
params = {}
auth = None
# Convert /foobar?+some-header=hello to proper header dictionary
results = apprise_parse_url(url)
if results:
# Add our headers that the user can potentially over-ride if they wish
# to to our returned result set and tidy entries by unquoting them
headers = {URLBase.unquote(x): URLBase.unquote(y)
for x, y in results['qsd+'].items()}
# https://github.com/caronc/apprise/wiki/Notify_Custom_JSON#get-parameter-manipulation
# In Apprise, it relies on prefixing each request arg with "-", because it uses say &method=update as a flag for apprise
# but here we are making straight requests, so we need todo convert this against apprise's logic
for k, v in results['qsd'].items():
if not k.strip('+-') in results['qsd+'].keys():
params[URLBase.unquote(k)] = URLBase.unquote(v)
# Determine Authentication
auth = ''
if results.get('user') and results.get('password'):
auth = (URLBase.unquote(results.get('user')), URLBase.unquote(results.get('user')))
elif results.get('user'):
auth = (URLBase.unquote(results.get('user')))
# Try to auto-guess if it's JSON
try:
json.loads(body)
headers['Content-Type'] = 'application/json; charset=utf-8'
except ValueError as e:
pass
r(results.get('url'),
auth=auth,
data=body,
headers=headers,
params=params
)
def process_notification(n_object, datastore):
# so that the custom endpoints are registered
from changedetectionio.apprise_plugin import apprise_custom_api_call_wrapper
from .safe_jinja import render as jinja_render
now = time.time()
@@ -157,7 +83,7 @@ def process_notification(n_object, datastore):
logger.warning(f"Process Notification: skipping empty notification URL.")
continue
logger.info(">> Process Notification: AppRise notifying {}".format(url))
logger.info(f">> Process Notification: AppRise notifying {url}")
url = jinja_render(template_str=url, **notification_parameters)
# Re 323 - Limit discord length to their 2000 char limit total or it wont send.
@@ -230,6 +156,7 @@ def process_notification(n_object, datastore):
log_value = logs.getvalue()
if log_value and 'WARNING' in log_value or 'ERROR' in log_value:
logger.critical(log_value)
raise Exception(log_value)
# Return what was sent for better logging - after the for loop
@@ -272,19 +199,18 @@ def create_notification_parameters(n_object, datastore):
tokens.update(
{
'base_url': base_url,
'current_snapshot': n_object.get('current_snapshot', ''),
'diff': n_object.get('diff', ''), # Null default in the case we use a test
'diff_added': n_object.get('diff_added', ''), # Null default in the case we use a test
'diff_full': n_object.get('diff_full', ''), # Null default in the case we use a test
'diff_patch': n_object.get('diff_patch', ''), # Null default in the case we use a test
'diff_removed': n_object.get('diff_removed', ''), # Null default in the case we use a test
'diff_url': diff_url,
'preview_url': preview_url,
'triggered_text': n_object.get('triggered_text', ''),
'watch_tag': watch_tag if watch_tag is not None else '',
'watch_title': watch_title if watch_title is not None else '',
'watch_url': watch_url,
'watch_uuid': uuid,
})
# n_object will contain diff, diff_added etc etc
tokens.update(n_object)
if uuid:
tokens.update(datastore.data['watching'].get(uuid).extra_notification_token_values())
return tokens

View File

@@ -1,4 +1,6 @@
from abc import abstractmethod
from changedetectionio.content_fetchers.base import Fetcher
from changedetectionio.strtobool import strtobool
from copy import deepcopy
@@ -23,9 +25,12 @@ class difference_detection_processor():
super().__init__(*args, **kwargs)
self.datastore = datastore
self.watch = deepcopy(self.datastore.data['watching'].get(watch_uuid))
# Generic fetcher that should be extended (requests, playwright etc)
self.fetcher = Fetcher()
def call_browser(self):
from requests.structures import CaseInsensitiveDict
# Protect against file:// access
if re.search(r'^file://', self.watch.get('url', '').strip(), re.IGNORECASE):
if not strtobool(os.getenv('ALLOW_FILE_URI', 'false')):
@@ -133,8 +138,18 @@ class difference_detection_processor():
is_binary = self.watch.is_pdf
# And here we go! call the right browser with browser-specific settings
self.fetcher.run(url, timeout, request_headers, request_body, request_method, ignore_status_codes, self.watch.get('include_filters'),
is_binary=is_binary)
empty_pages_are_a_change = self.datastore.data['settings']['application'].get('empty_pages_are_a_change', False)
self.fetcher.run(url=url,
timeout=timeout,
request_headers=request_headers,
request_body=request_body,
request_method=request_method,
ignore_status_codes=ignore_status_codes,
current_include_filters=self.watch.get('include_filters'),
is_binary=is_binary,
empty_pages_are_a_change=empty_pages_are_a_change
)
#@todo .quit here could go on close object, so we can run JS if change-detected
self.fetcher.quit()
@@ -147,7 +162,7 @@ class difference_detection_processor():
some_data = 'xxxxx'
update_obj["previous_md5"] = hashlib.md5(some_data.encode('utf-8')).hexdigest()
changed_detected = False
return changed_detected, update_obj, ''.encode('utf-8')
return changed_detected, update_obj, ''.encode('utf-8'), b''
def find_sub_packages(package_name):

View File

@@ -1,11 +1,12 @@
from changedetectionio.model.Watch import model as BaseWatch
import re
from babel.numbers import parse_decimal
from changedetectionio.model.Watch import model as BaseWatch
from typing import Union
import re
class Restock(dict):
def parse_currency(self, raw_value: str) -> float:
def parse_currency(self, raw_value: str) -> Union[float, None]:
# Clean and standardize the value (ie 1,400.00 should be 1400.00), even better would be store the whole thing as an integer.
standardized_value = raw_value
@@ -21,8 +22,11 @@ class Restock(dict):
# Remove any non-numeric characters except for the decimal point
standardized_value = re.sub(r'[^\d.-]', '', standardized_value)
# Convert to float
return float(parse_decimal(standardized_value, locale='en'))
if standardized_value:
# Convert to float
return float(parse_decimal(standardized_value, locale='en'))
return None
def __init__(self, *args, **kwargs):
# Define default values
@@ -45,13 +49,10 @@ class Restock(dict):
def __setitem__(self, key, value):
# Custom logic to handle setting price and original_price
if key == 'price':
if key == 'price' or key == 'original_price':
if isinstance(value, str):
value = self.parse_currency(raw_value=value)
if value and not self.get('original_price'):
self['original_price'] = value
super().__setitem__(key, value)
class Watch(BaseWatch):
@@ -68,3 +69,16 @@ class Watch(BaseWatch):
super().clear_watch()
self.update({'restock': Restock()})
def extra_notification_token_values(self):
values = super().extra_notification_token_values()
values['restock'] = self.get('restock', {})
return values
def extra_notification_token_placeholder_info(self):
values = super().extra_notification_token_placeholder_info()
values.append(('restock.price', "Price detected"))
values.append(('restock.original_price', "Original price at first check"))
return values

View File

@@ -2,8 +2,7 @@ from .. import difference_detection_processor
from ..exceptions import ProcessorException
from . import Restock
from loguru import logger
import hashlib
import re
import urllib3
import time
@@ -27,6 +26,25 @@ def _search_prop_by_value(matches, value):
if value in prop[0]:
return prop[1] # Yield the desired value and exit the function
def _deduplicate_prices(data):
seen = set()
unique_data = []
for datum in data:
# Convert 'value' to float if it can be a numeric string, otherwise leave it as is
try:
normalized_value = float(datum.value) if isinstance(datum.value, str) and datum.value.replace('.', '', 1).isdigit() else datum.value
except ValueError:
normalized_value = datum.value
# If the normalized value hasn't been seen yet, add it to unique data
if normalized_value not in seen:
unique_data.append(datum)
seen.add(normalized_value)
return unique_data
# should return Restock()
# add casting?
def get_itemprop_availability(html_content) -> Restock:
@@ -36,17 +54,21 @@ def get_itemprop_availability(html_content) -> Restock:
"""
from jsonpath_ng import parse
import re
now = time.time()
import extruct
logger.trace(f"Imported extruct module in {time.time() - now:.3f}s")
value = {}
now = time.time()
# Extruct is very slow, I'm wondering if some ML is going to be faster (800ms on my i7), 'rdfa' seems to be the heaviest.
syntaxes = ['dublincore', 'json-ld', 'microdata', 'microformat', 'opengraph']
try:
data = extruct.extract(html_content, syntaxes=syntaxes)
except Exception as e:
logger.warning(f"Unable to extract data, document parsing with extruct failed with {type(e).__name__} - {str(e)}")
return Restock()
data = extruct.extract(html_content, syntaxes=syntaxes)
logger.trace(f"Extruct basic extract of all metadata done in {time.time() - now:.3f}s")
# First phase, dead simple scanning of anything that looks useful
@@ -57,7 +79,7 @@ def get_itemprop_availability(html_content) -> Restock:
pricecurrency_parse = parse('$..(pricecurrency|currency|priceCurrency )')
availability_parse = parse('$..(availability|Availability)')
price_result = price_parse.find(data)
price_result = _deduplicate_prices(price_parse.find(data))
if price_result:
# Right now, we just support single product items, maybe we will store the whole actual metadata seperately in teh future and
# parse that for the UI?
@@ -119,6 +141,8 @@ class perform_site_check(difference_detection_processor):
xpath_data = None
def run_changedetection(self, watch, skip_when_checksum_same=True):
import hashlib
if not watch:
raise Exception("Watch no longer exists.")
@@ -132,6 +156,20 @@ class perform_site_check(difference_detection_processor):
update_obj['content_type'] = self.fetcher.headers.get('Content-Type', '')
update_obj["last_check_status"] = self.fetcher.get_last_status_code()
# Only try to process restock information (like scraping for keywords) if the page was actually rendered correctly.
# Otherwise it will assume "in stock" because nothing suggesting the opposite was found
from ...html_tools import html_to_text
text = html_to_text(self.fetcher.content)
logger.debug(f"Length of text after conversion: {len(text)}")
if not len(text):
from ...content_fetchers.exceptions import ReplyWithContentButNoText
raise ReplyWithContentButNoText(url=watch.link,
status_code=self.fetcher.get_last_status_code(),
screenshot=self.fetcher.screenshot,
html_content=self.fetcher.content,
xpath_data=self.fetcher.xpath_data
)
# Which restock settings to compare against?
restock_settings = watch.get('restock_settings', {})
@@ -146,7 +184,7 @@ class perform_site_check(difference_detection_processor):
itemprop_availability = {}
try:
itemprop_availability = get_itemprop_availability(html_content=self.fetcher.content)
itemprop_availability = get_itemprop_availability(self.fetcher.content)
except MoreThanOnePriceFound as e:
# Add the real data
raise ProcessorException(message="Cannot run, more than one price detected, this plugin is only for product pages with ONE product, try the content-change detection mode.",
@@ -177,6 +215,11 @@ class perform_site_check(difference_detection_processor):
# Main detection method
fetched_md5 = None
# store original price if not set
if itemprop_availability and itemprop_availability.get('price') and not itemprop_availability.get('original_price'):
itemprop_availability['original_price'] = itemprop_availability.get('price')
update_obj['restock']["original_price"] = itemprop_availability.get('price')
if not self.fetcher.instock_data and not itemprop_availability.get('availability'):
raise ProcessorException(
message=f"Unable to extract restock data for this page unfortunately. (Got code {self.fetcher.get_last_status_code()} from server), no embedded stock information was found and nothing interesting in the text, try using this watch with Chrome.",
@@ -195,7 +238,7 @@ class perform_site_check(difference_detection_processor):
# What we store in the snapshot
price = update_obj.get('restock').get('price') if update_obj.get('restock').get('price') else ""
snapshot_content = f"{update_obj.get('restock').get('in_stock')} - {price}"
snapshot_content = f"In Stock: {update_obj.get('restock').get('in_stock')} - Price: {price}"
# Main detection method
fetched_md5 = hashlib.md5(snapshot_content.encode('utf-8')).hexdigest()
@@ -255,4 +298,4 @@ class perform_site_check(difference_detection_processor):
# Always record the new checksum
update_obj["previous_md5"] = fetched_md5
return changed_detected, update_obj, snapshot_content.encode('utf-8').strip()
return changed_detected, update_obj, snapshot_content.encode('utf-8').strip(), b''

View File

@@ -36,6 +36,7 @@ class PDFToHTMLToolNotFound(ValueError):
class perform_site_check(difference_detection_processor):
def run_changedetection(self, watch, skip_when_checksum_same=True):
changed_detected = False
html_content = ""
screenshot = False # as bytes
@@ -175,13 +176,13 @@ class perform_site_check(difference_detection_processor):
html_content=self.fetcher.content,
append_pretty_line_formatting=not watch.is_source_type_url,
is_rss=is_rss)
elif filter_rule.startswith('xpath1:'):
html_content += html_tools.xpath1_filter(xpath_filter=filter_rule.replace('xpath1:', ''),
html_content=self.fetcher.content,
append_pretty_line_formatting=not watch.is_source_type_url,
is_rss=is_rss)
html_content=self.fetcher.content,
append_pretty_line_formatting=not watch.is_source_type_url,
is_rss=is_rss)
else:
# CSS Filter, extract the HTML that matches and feed that into the existing inscriptis::get_text
html_content += html_tools.include_filters(include_filters=filter_rule,
html_content=self.fetcher.content,
append_pretty_line_formatting=not watch.is_source_type_url)
@@ -197,18 +198,23 @@ class perform_site_check(difference_detection_processor):
else:
# extract text
do_anchor = self.datastore.data["settings"]["application"].get("render_anchor_tag_content", False)
stripped_text_from_html = \
html_tools.html_to_text(
html_content=html_content,
render_anchor_tag_content=do_anchor,
is_rss=is_rss # #1874 activate the <title workaround hack
)
stripped_text_from_html = html_tools.html_to_text(html_content=html_content,
render_anchor_tag_content=do_anchor,
is_rss=is_rss) # 1874 activate the <title workaround hack
if watch.get('sort_text_alphabetically') and stripped_text_from_html:
if watch.get('trim_text_whitespace'):
stripped_text_from_html = '\n'.join(line.strip() for line in stripped_text_from_html.replace("\n\n", "\n").splitlines())
if watch.get('remove_duplicate_lines'):
stripped_text_from_html = '\n'.join(dict.fromkeys(line.strip() for line in stripped_text_from_html.replace("\n\n", "\n").splitlines()))
if watch.get('sort_text_alphabetically'):
# Note: Because a <p>something</p> will add an extra line feed to signify the paragraph gap
# we end up with 'Some text\n\n', sorting will add all those extra \n at the start, so we remove them here.
stripped_text_from_html = stripped_text_from_html.replace('\n\n', '\n')
stripped_text_from_html = '\n'.join( sorted(stripped_text_from_html.splitlines(), key=lambda x: x.lower() ))
stripped_text_from_html = stripped_text_from_html.replace("\n\n", "\n")
stripped_text_from_html = '\n'.join(sorted(stripped_text_from_html.splitlines(), key=lambda x: x.lower()))
# Re #340 - return the content before the 'ignore text' was applied
text_content_before_ignored_filter = stripped_text_from_html.encode('utf-8')
@@ -236,7 +242,7 @@ class perform_site_check(difference_detection_processor):
# We had some content, but no differences were found
# Store our new file as the MD5 so it will trigger in the future
c = hashlib.md5(text_content_before_ignored_filter.translate(None, b'\r\n\t ')).hexdigest()
return False, {'previous_md5': c}, stripped_text_from_html.encode('utf-8')
return False, {'previous_md5': c}, stripped_text_from_html.encode('utf-8'), stripped_text_from_html.encode('utf-8')
else:
stripped_text_from_html = rendered_diff
@@ -290,7 +296,7 @@ class perform_site_check(difference_detection_processor):
for match in res:
regex_matched_output += [match] + [b'\n']
# Now we will only show what the regex matched
##########################################################
stripped_text_from_html = b''
text_content_before_ignored_filter = b''
if regex_matched_output:
@@ -298,6 +304,8 @@ class perform_site_check(difference_detection_processor):
stripped_text_from_html = b''.join(regex_matched_output)
text_content_before_ignored_filter = stripped_text_from_html
# Re #133 - if we should strip whitespaces from triggering the change detected comparison
if self.datastore.data['settings']['application'].get('ignore_whitespace', False):
fetched_md5 = hashlib.md5(stripped_text_from_html.translate(None, b'\r\n\t ')).hexdigest()
@@ -357,4 +365,4 @@ class perform_site_check(difference_detection_processor):
if not watch.get('previous_md5'):
watch['previous_md5'] = fetched_md5
return changed_detected, update_obj, text_content_before_ignored_filter
return changed_detected, update_obj, text_content_before_ignored_filter, stripped_text_from_html

View File

@@ -35,4 +35,8 @@ pytest tests/test_access_control.py
pytest tests/test_notification.py
pytest tests/test_backend.py
pytest tests/test_rss.py
pytest tests/test_unique_lines.py
pytest tests/test_unique_lines.py
# Check file:// will pickup a file when enabled
echo "Hello world" > /tmp/test-file.txt
ALLOW_FILE_URI=yes pytest tests/test_security.py

View File

@@ -16,25 +16,31 @@ echo "---------------------------------- SOCKS5 -------------------"
docker run --network changedet-network \
-v `pwd`/tests/proxy_socks5/proxies.json-example:/app/changedetectionio/test-datastore/proxies.json \
--rm \
-e "FLASK_SERVER_NAME=cdio" \
--hostname cdio \
-e "SOCKSTEST=proxiesjson" \
test-changedetectionio \
bash -c 'cd changedetectionio && pytest tests/proxy_socks5/test_socks5_proxy_sources.py'
bash -c 'cd changedetectionio && pytest --live-server-host=0.0.0.0 --live-server-port=5004 -s tests/proxy_socks5/test_socks5_proxy_sources.py'
# SOCKS5 related - by manually entering in UI
docker run --network changedet-network \
--rm \
-e "FLASK_SERVER_NAME=cdio" \
--hostname cdio \
-e "SOCKSTEST=manual" \
test-changedetectionio \
bash -c 'cd changedetectionio && pytest tests/proxy_socks5/test_socks5_proxy.py'
bash -c 'cd changedetectionio && pytest --live-server-host=0.0.0.0 --live-server-port=5004 -s tests/proxy_socks5/test_socks5_proxy.py'
# SOCKS5 related - test from proxies.json via playwright - NOTE- PLAYWRIGHT DOESNT SUPPORT AUTHENTICATING PROXY
docker run --network changedet-network \
-e "SOCKSTEST=manual-playwright" \
--hostname cdio \
-e "FLASK_SERVER_NAME=cdio" \
-v `pwd`/tests/proxy_socks5/proxies.json-example-noauth:/app/changedetectionio/test-datastore/proxies.json \
-e "PLAYWRIGHT_DRIVER_URL=ws://sockpuppetbrowser:3000" \
--rm \
test-changedetectionio \
bash -c 'cd changedetectionio && pytest tests/proxy_socks5/test_socks5_proxy_sources.py'
bash -c 'cd changedetectionio && pytest --live-server-host=0.0.0.0 --live-server-port=5004 -s tests/proxy_socks5/test_socks5_proxy_sources.py'
echo "socks5 server logs"
docker logs socks5proxy

View File

@@ -18,9 +18,11 @@ $(document).ready(function () {
});
$("#notification-token-toggle").click(function (e) {
$(".toggle-show").click(function (e) {
e.preventDefault();
$('#notification-tokens-info').toggle();
let target = $(this).data('target');
$(target).toggle();
});
});

View File

@@ -12,6 +12,54 @@ function toggleOpacity(checkboxSelector, fieldSelector, inverted) {
checkbox.addEventListener('change', updateOpacity);
}
(function($) {
// Object to store ongoing requests by namespace
const requests = {};
$.abortiveSingularAjax = function(options) {
const namespace = options.namespace || 'default';
// Abort the current request in this namespace if it's still ongoing
if (requests[namespace]) {
requests[namespace].abort();
}
// Start a new AJAX request and store its reference in the correct namespace
requests[namespace] = $.ajax(options);
// Return the current request in case it's needed
return requests[namespace];
};
})(jQuery);
function request_textpreview_update() {
if (!$('body').hasClass('preview-text-enabled')) {
return
}
const data = {};
$('textarea:visible, input:visible').each(function () {
const $element = $(this); // Cache the jQuery object for the current element
const name = $element.attr('name'); // Get the name attribute of the element
data[name] = $element.is(':checkbox') ? ($element.is(':checked') ? $element.val() : undefined) : $element.val();
});
$.abortiveSingularAjax({
type: "POST",
url: preview_text_edit_filters_url,
data: data,
namespace: 'watchEdit'
}).done(function (data) {
$('#filters-and-triggers #text-preview-inner').text(data);
}).fail(function (error) {
if (error.statusText === 'abort') {
console.log('Request was aborted due to a new request being fired.');
} else {
$('#filters-and-triggers #text-preview-inner').text('There was an error communicating with the server.');
}
})
}
$(document).ready(function () {
$('#notification-setting-reset-to-default').click(function (e) {
$('#notification_title').val('');
@@ -27,5 +75,23 @@ $(document).ready(function () {
toggleOpacity('#time_between_check_use_default', '#time_between_check', false);
const vh = Math.max(document.documentElement.clientHeight || 0, window.innerHeight || 0);
$("#text-preview-inner").css('max-height', (vh-300)+"px");
var debounced_request_textpreview_update = request_textpreview_update.debounce(100);
$("#activate-text-preview").click(function (e) {
$(this).fadeOut();
$('body').toggleClass('preview-text-enabled')
request_textpreview_update();
$("#text-preview-refresh").click(function (e) {
request_textpreview_update();
});
$('textarea:visible').on('keyup blur', debounced_request_textpreview_update);
$('input:visible').on('keyup blur change', debounced_request_textpreview_update);
$("#filters-and-triggers-tab").on('click', debounced_request_textpreview_update);
});
});

View File

@@ -40,15 +40,29 @@
}
}
#browser-steps-fieldlist {
height: 100%;
overflow-y: scroll;
}
#browser-steps .flex-wrapper {
display: flex;
flex-flow: row;
height: 70vh;
font-size: 80%;
#browser-steps-ui {
flex-grow: 1; /* Allow it to grow and fill the available space */
flex-shrink: 1; /* Allow it to shrink if needed */
flex-basis: 0; /* Start with 0 base width so it stretches as much as possible */
background-color: #eee;
border-radius: 5px;
}
#browser-steps-fieldlist {
flex-grow: 0; /* Don't allow it to grow */
flex-shrink: 0; /* Don't allow it to shrink */
flex-basis: auto; /* Base width is determined by the content */
max-width: 400px; /* Set a max width to prevent overflow */
padding-left: 1rem;
overflow-y: scroll;
}
}
/* this is duplicate :( */

View File

@@ -0,0 +1,45 @@
body.preview-text-enabled {
#filters-and-triggers > div {
display: flex; /* Establishes Flexbox layout */
gap: 20px; /* Adds space between the columns */
position: relative; /* Ensures the sticky positioning is relative to this parent */
}
/* layout of the page */
#edit-text-filter, #text-preview {
flex: 1; /* Each column takes an equal amount of available space */
align-self: flex-start; /* Aligns the right column to the start, allowing it to maintain its content height */
}
#edit-text-filter {
#pro-tips {
display: none;
}
}
#text-preview {
position: sticky;
top: 25px;
display: block !important;
}
/* actual preview area */
#text-preview-inner {
background: var(--color-grey-900);
border: 1px solid var(--color-grey-600);
padding: 1rem;
color: #333;
font-family: "Courier New", Courier, monospace; /* Sets the font to a monospace type */
font-size: 12px;
overflow-x: scroll;
white-space: pre-wrap; /* Preserves whitespace and line breaks like <pre> */
overflow-wrap: break-word; /* Allows long words to break and wrap to the next line */
}
}
#activate-text-preview {
right: 0;
position: absolute;
z-index: 0;
box-shadow: 1px 1px 4px var(--color-shadow-jump);
}

View File

@@ -12,6 +12,7 @@
@import "parts/_darkmode";
@import "parts/_menu";
@import "parts/_love";
@import "parts/preview_text_filter";
body {
color: var(--color-text);

View File

@@ -46,14 +46,31 @@
#browser_steps li > label {
display: none; }
#browser-steps-fieldlist {
height: 100%;
overflow-y: scroll; }
#browser-steps .flex-wrapper {
display: flex;
flex-flow: row;
height: 70vh; }
height: 70vh;
font-size: 80%; }
#browser-steps .flex-wrapper #browser-steps-ui {
flex-grow: 1;
/* Allow it to grow and fill the available space */
flex-shrink: 1;
/* Allow it to shrink if needed */
flex-basis: 0;
/* Start with 0 base width so it stretches as much as possible */
background-color: #eee;
border-radius: 5px; }
#browser-steps .flex-wrapper #browser-steps-fieldlist {
flex-grow: 0;
/* Don't allow it to grow */
flex-shrink: 0;
/* Don't allow it to shrink */
flex-basis: auto;
/* Base width is determined by the content */
max-width: 400px;
/* Set a max width to prevent overflow */
padding-left: 1rem;
overflow-y: scroll; }
/* this is duplicate :( */
#browsersteps-selector-wrapper {
@@ -411,6 +428,47 @@ html[data-darkmode="true"] #toggle-light-mode .icon-dark {
fill: #ff0000 !important;
transition: all ease 0.3s !important; }
body.preview-text-enabled {
/* layout of the page */
/* actual preview area */ }
body.preview-text-enabled #filters-and-triggers > div {
display: flex;
/* Establishes Flexbox layout */
gap: 20px;
/* Adds space between the columns */
position: relative;
/* Ensures the sticky positioning is relative to this parent */ }
body.preview-text-enabled #edit-text-filter, body.preview-text-enabled #text-preview {
flex: 1;
/* Each column takes an equal amount of available space */
align-self: flex-start;
/* Aligns the right column to the start, allowing it to maintain its content height */ }
body.preview-text-enabled #edit-text-filter #pro-tips {
display: none; }
body.preview-text-enabled #text-preview {
position: sticky;
top: 25px;
display: block !important; }
body.preview-text-enabled #text-preview-inner {
background: var(--color-grey-900);
border: 1px solid var(--color-grey-600);
padding: 1rem;
color: #333;
font-family: "Courier New", Courier, monospace;
/* Sets the font to a monospace type */
font-size: 12px;
overflow-x: scroll;
white-space: pre-wrap;
/* Preserves whitespace and line breaks like <pre> */
overflow-wrap: break-word;
/* Allows long words to break and wrap to the next line */ }
#activate-text-preview {
right: 0;
position: absolute;
z-index: 0;
box-shadow: 1px 1px 4px var(--color-shadow-jump); }
body {
color: var(--color-text);
background: var(--color-background-page);
@@ -1194,11 +1252,9 @@ ul {
color: #fff;
opacity: 0.7; }
.restock-label svg {
vertical-align: middle; }
#chrome-extension-link {
padding: 9px;
border: 1px solid var(--color-grey-800);

View File

@@ -1,3 +1,5 @@
import pathlib
from changedetectionio.strtobool import strtobool
from flask import (
@@ -11,7 +13,6 @@ from threading import Lock
import json
import os
import re
import requests
import secrets
import threading
import time
@@ -61,14 +62,28 @@ class ChangeDetectionStore:
try:
# @todo retest with ", encoding='utf-8'"
logger.debug(f"Scanning for watch.json in data-directory")
self.__data['watching'] = {}
for item in pathlib.Path(datastore_path).glob("*/watch.json"):
try:
logger.trace(f"Found {item}")
with open(item, 'r') as f:
loaded = json.load(f)
if not loaded.get('uuid'):
logger.error(f"Unable to find UUID in {item}")
self.__data['watching'][loaded.get('uuid')] = loaded
except Exception as e:
logger.critical(f"Error reading watch config file {item} - {str(e)}")
# Convert each existing watch back to the Watch.model object
for uuid, watch in self.__data['watching'].items():
self.__data['watching'][uuid] = self.rehydrate_entity(uuid, watch)
logger.info(f"Watching: {uuid} {watch['url']}")
with open(self.json_store_path) as json_file:
from_disk = json.load(json_file)
# @todo isnt there a way todo this dict.update recursively?
# Problem here is if the one on the disk is missing a sub-struct, it wont be present anymore.
if 'watching' in from_disk:
self.__data['watching'].update(from_disk['watching'])
if 'app_guid' in from_disk:
self.__data['app_guid'] = from_disk['app_guid']
@@ -82,11 +97,6 @@ class ChangeDetectionStore:
if 'application' in from_disk['settings']:
self.__data['settings']['application'].update(from_disk['settings']['application'])
# Convert each existing watch back to the Watch.model object
for uuid, watch in self.__data['watching'].items():
self.__data['watching'][uuid] = self.rehydrate_entity(uuid, watch)
logger.info(f"Watching: {uuid} {watch['url']}")
# And for Tags also, should be Restock type because it has extra settings
for uuid, tag in self.__data['settings']['application']['tags'].items():
self.__data['settings']['application']['tags'][uuid] = self.rehydrate_entity(uuid, tag, processor_override='restock_diff')
@@ -270,6 +280,7 @@ class ChangeDetectionStore:
self.needs_write_urgent = True
def add_watch(self, url, tag='', extras=None, tag_uuids=None, write_to_disk_now=True):
import requests
if extras is None:
extras = {}
@@ -382,25 +393,42 @@ class ChangeDetectionStore:
def sync_to_json(self):
logger.info("Saving JSON..")
try:
data = deepcopy(self.__data)
config_data = deepcopy(self.__data)
watch_data = config_data.pop('watching')
except RuntimeError as e:
# Try again in 15 seconds
time.sleep(15)
time.sleep(5)
logger.error(f"! Data changed when writing to JSON, trying again.. {str(e)}")
self.sync_to_json()
return
else:
# Write the main config file for general settings
try:
# Re #286 - First write to a temp file, then confirm it looks OK and rename it
# This is a fairly basic strategy to deal with the case that the file is corrupted,
# system was out of memory, out of RAM etc
with open(self.json_store_path+".tmp", 'w') as json_file:
json.dump(data, json_file, indent=4)
json.dump(config_data, json_file, indent=2)
os.replace(self.json_store_path+".tmp", self.json_store_path)
except Exception as e:
logger.error(f"Error writing JSON!! (Main JSON file save was skipped) : {str(e)}")
for uuid, watch in watch_data.items():
watch_config_path = os.path.join(self.datastore_path, uuid, "watch.json")
# Write the main config file for general settings
try:
# Re #286 - First write to a temp file, then confirm it looks OK and rename it
# This is a fairly basic strategy to deal with the case that the file is corrupted,
# system was out of memory, out of RAM etc
with open(watch_config_path + ".tmp", 'w') as json_file:
json.dump(watch, json_file, indent=2)
logger.trace(f"Saved/written watch {uuid} to {watch_config_path}")
os.replace(watch_config_path + ".tmp", watch_config_path)
except Exception as e:
logger.error(f"Error writing watch config {uuid} to {watch_config_path} JSON!! {str(e)}")
self.needs_write = False
self.needs_write_urgent = False
@@ -432,25 +460,6 @@ class ChangeDetectionStore:
if self.stop_thread or self.needs_write_urgent:
break
# Go through the datastore path and remove any snapshots that are not mentioned in the index
# This usually is not used, but can be handy.
def remove_unused_snapshots(self):
logger.info("Removing snapshots from datastore that are not in the index..")
index=[]
for uuid in self.data['watching']:
for id in self.data['watching'][uuid].history:
index.append(self.data['watching'][uuid].history[str(id)])
import pathlib
# Only in the sub-directories
for uuid in self.data['watching']:
for item in pathlib.Path(self.datastore_path).rglob(uuid+"/*.txt"):
if not str(item) in index:
logger.info(f"Removing {item}")
unlink(item)
@property
def proxy_list(self):
proxy_list = {}
@@ -631,6 +640,33 @@ class ChangeDetectionStore:
return True
return False
def get_unique_notification_tokens_available(self):
# Ask each type of watch if they have any extra notification token to add to the validation
extra_notification_tokens = {}
watch_processors_checked = set()
for watch_uuid, watch in self.__data['watching'].items():
processor = watch.get('processor')
if processor not in watch_processors_checked:
extra_notification_tokens.update(watch.extra_notification_token_values())
watch_processors_checked.add(processor)
return extra_notification_tokens
def get_unique_notification_token_placeholders_available(self):
# The actual description of the tokens, could be combined with get_unique_notification_tokens_available instead of doing this twice
extra_notification_tokens = []
watch_processors_checked = set()
for watch_uuid, watch in self.__data['watching'].items():
processor = watch.get('processor')
if processor not in watch_processors_checked:
extra_notification_tokens+=watch.extra_notification_token_placeholder_info()
watch_processors_checked.add(processor)
return extra_notification_tokens
def get_updates_available(self):
import inspect
updates_available = []

View File

@@ -1,7 +1,7 @@
{% from '_helpers.html' import render_field %}
{% macro render_common_settings_form(form, emailprefix, settings_application) %}
{% macro render_common_settings_form(form, emailprefix, settings_application, extra_notification_token_placeholder_info) %}
<div class="pure-control-group">
{{ render_field(form.notification_urls, rows=5, placeholder="Examples:
Gitter - gitter://token/room
@@ -11,8 +11,11 @@
class="notification-urls" )
}}
<div class="pure-form-message-inline">
<ul>
<li>Use <a target=_new href="https://github.com/caronc/apprise">AppRise URLs</a> for notification to just about any service! <i><a target=_new href="https://github.com/dgtlmoon/changedetection.io/wiki/Notification-configuration-notes">Please read the notification services wiki here for important configuration notes</a></i>.</li>
<p>
<strong>Tip:</strong> Use <a target=_new href="https://github.com/caronc/apprise">AppRise Notification URLs</a> for notification to just about any service! <i><a target=_new href="https://github.com/dgtlmoon/changedetection.io/wiki/Notification-configuration-notes">Please read the notification services wiki here for important configuration notes</a></i>.<br>
</p>
<div data-target="#advanced-help-notifications" class="toggle-show pure-button button-tag button-xsmall">Show advanced help and tips</div>
<ul style="display: none" id="advanced-help-notifications">
<li><code><a target=_new href="https://github.com/caronc/apprise/wiki/Notify_discord">discord://</a></code> (or <code>https://discord.com/api/webhooks...</code>)) only supports a maximum <strong>2,000 characters</strong> of notification text, including the title.</li>
<li><code><a target=_new href="https://github.com/caronc/apprise/wiki/Notify_telegram">tgram://</a></code> bots can't send messages to other bots, so you should specify chat ID of non-bot user.</li>
<li><code><a target=_new href="https://github.com/caronc/apprise/wiki/Notify_telegram">tgram://</a></code> only supports very limited HTML and can fail when extra tags are sent, <a href="https://core.telegram.org/bots/api#html-style">read more here</a> (or use plaintext/markdown format)</li>
@@ -40,7 +43,7 @@
</div>
<div class="pure-controls">
<div id="notification-token-toggle" class="pure-button button-tag button-xsmall">Show token/placeholders</div>
<div data-target="#notification-tokens-info" class="toggle-show pure-button button-tag button-xsmall">Show token/placeholders</div>
</div>
<div class="pure-controls" style="display: none;" id="notification-tokens-info">
<table class="pure-table" id="token-table">
@@ -107,7 +110,15 @@
<tr>
<td><code>{{ '{{triggered_text}}' }}</code></td>
<td>Text that tripped the trigger from filters</td>
</tr>
{% if extra_notification_token_placeholder_info %}
{% for token in extra_notification_token_placeholder_info %}
<tr>
<td><code>{{ '{{' }}{{ token[0] }}{{ '}}' }}</code></td>
<td>{{ token[1] }}</td>
</tr>
{% endfor %}
{% endif %}
</tbody>
</table>
<div class="pure-form-message-inline">

View File

@@ -33,7 +33,7 @@
<script src="{{url_for('static_content', group='js', filename='csrf.js')}}" defer></script>
</head>
<body>
<body class="">
<div class="header">
<div class="home-menu pure-menu pure-menu-horizontal pure-menu-fixed" id="nav-menu">
{% if has_password and not current_user.is_authenticated %}

View File

@@ -4,6 +4,7 @@
{% from '_common_fields.html' import render_common_settings_form %}
<script src="{{url_for('static_content', group='js', filename='tabs.js')}}" defer></script>
<script src="{{url_for('static_content', group='js', filename='vis.js')}}" defer></script>
<script src="{{url_for('static_content', group='js', filename='global-settings.js')}}" defer></script>
<script>
const browser_steps_available_screenshots=JSON.parse('{{ watch.get_browsersteps_available_screenshots|tojson }}');
const browser_steps_config=JSON.parse('{{ browser_steps_config|tojson }}');
@@ -49,7 +50,7 @@
{% endif %}
{% if watch['processor'] == 'text_json_diff' %}
<li class="tab"><a id="visualselector-tab" href="#visualselector">Visual Filter Selector</a></li>
<li class="tab"><a href="#filters-and-triggers">Filters &amp; Triggers</a></li>
<li class="tab" id="filters-and-triggers-tab"><a href="#filters-and-triggers">Filters &amp; Triggers</a></li>
{% endif %}
<li class="tab"><a href="#notifications">Notifications</a></li>
<li class="tab"><a href="#stats">Stats</a></li>
@@ -199,7 +200,7 @@ User-Agent: wonderbra 1.0") }}
<div id="loading-status-text" style="display: none;">Please wait, first browser step can take a little time to load..<div class="spinner"></div></div>
<div class="flex-wrapper" >
<div id="browser-steps-ui" class="noselect" style="width: 100%; background-color: #eee; border-radius: 5px;">
<div id="browser-steps-ui" class="noselect">
<div class="noselect" id="browsersteps-selector-wrapper" style="width: 100%">
<span class="loader" >
@@ -214,7 +215,7 @@ User-Agent: wonderbra 1.0") }}
<canvas class="noselect" id="browsersteps-selector-canvas" style="max-width: 100%; width: 100%;"></canvas>
</div>
</div>
<div id="browser-steps-fieldlist" style="padding-left: 1em; width: 350px; font-size: 80%;" >
<div id="browser-steps-fieldlist" >
<span id="browser-seconds-remaining">Loading</span> <span style="font-size: 80%;"> (<a target=_new href="https://github.com/dgtlmoon/changedetection.io/pull/478/files#diff-1a79d924d1840c485238e66772391268a89c95b781d69091384cf1ea1ac146c9R4">?</a>) </span>
{{ render_field(form.browser_steps) }}
</div>
@@ -246,14 +247,17 @@ User-Agent: wonderbra 1.0") }}
{% endif %}
<a href="#notifications" id="notification-setting-reset-to-default" class="pure-button button-xsmall" style="right: 20px; top: 20px; position: absolute; background-color: #5f42dd; border-radius: 4px; font-size: 70%; color: #fff">Use system defaults</a>
{{ render_common_settings_form(form, emailprefix, settings_application) }}
{{ render_common_settings_form(form, emailprefix, settings_application, extra_notification_token_placeholder_info) }}
</div>
</fieldset>
</div>
{% if watch['processor'] == 'text_json_diff' %}
<div class="tab-pane-inner" id="filters-and-triggers">
<div class="pure-control-group">
<span id="activate-text-preview" class="pure-button pure-button-primary button-xsmall">Activate preview</span>
<div>
<div id="edit-text-filter">
<div class="pure-control-group" id="pro-tips">
<strong>Pro-tips:</strong><br>
<ul>
<li>
@@ -275,9 +279,9 @@ xpath://body/div/span[contains(@class, 'example-class')]",
{% if '/text()' in field %}
<span class="pure-form-message-inline"><strong>Note!: //text() function does not work where the &lt;element&gt; contains &lt;![CDATA[]]&gt;</strong></span><br>
{% endif %}
<span class="pure-form-message-inline">One rule per line, <i>any</i> rules that matches will be used.<br>
<ul>
<span class="pure-form-message-inline">One CSS, xPath, JSON Path/JQ selector per line, <i>any</i> rules that matches will be used.<br>
<p><div data-target="#advanced-help-selectors" class="toggle-show pure-button button-tag button-xsmall">Show advanced help and tips</div><br></p>
<ul id="advanced-help-selectors" style="display: none;">
<li>CSS - Limit text to this CSS rule, only text matching this CSS rule is included.</li>
<li>JSON - Limit text to this JSON rule, using either <a href="https://pypi.org/project/jsonpath-ng/" target="new">JSONPath</a> or <a href="https://stedolan.github.io/jq/" target="new">jq</a> (if installed).
<ul>
@@ -297,21 +301,25 @@ xpath://body/div/span[contains(@class, 'example-class')]",
<li>To use XPath1.0: Prefix with <code>xpath1:</code></li>
</ul>
</li>
</ul>
Please be sure that you thoroughly understand how to write CSS, JSONPath, XPath{% if jq_support %}, or jq selector{%endif%} rules before filing an issue on GitHub! <a
<li>
Please be sure that you thoroughly understand how to write CSS, JSONPath, XPath{% if jq_support %}, or jq selector{%endif%} rules before filing an issue on GitHub! <a
href="https://github.com/dgtlmoon/changedetection.io/wiki/CSS-Selector-help">here for more CSS selector help</a>.<br>
</li>
</ul>
</span>
</div>
<fieldset class="pure-control-group">
{{ render_field(form.subtractive_selectors, rows=5, placeholder=has_tag_filters_extra+"header
footer
nav
.stockticker") }}
.stockticker
//*[contains(text(), 'Advertisement')]") }}
<span class="pure-form-message-inline">
<ul>
<li> Remove HTML element(s) by CSS selector before text conversion. </li>
<li> Don't paste HTML here, use only CSS selectors </li>
<li> Add multiple elements or CSS selectors per line to ignore multiple parts of the HTML. </li>
<li> Remove HTML element(s) by CSS and XPath selectors before text conversion. </li>
<li> Don't paste HTML here, use only CSS and XPath selectors </li>
<li> Add multiple elements, CSS or XPath selectors per line to ignore multiple parts of the HTML. </li>
</ul>
</span>
</fieldset>
@@ -326,14 +334,21 @@ nav
<span class="pure-form-message-inline">So it's always better to select <strong>Added</strong>+<strong>Replaced</strong> when you're interested in new content.</span><br>
<span class="pure-form-message-inline">When content is merely moved in a list, it will also trigger an <strong>addition</strong>, consider enabling <code><strong>Only trigger when unique lines appear</strong></code></span>
</fieldset>
<fieldset class="pure-control-group">
{{ render_checkbox_field(form.check_unique_lines) }}
<span class="pure-form-message-inline">Good for websites that just move the content around, and you want to know when NEW content is added, compares new lines against all history for this watch.</span>
</fieldset>
<fieldset class="pure-control-group">
{{ render_checkbox_field(form.remove_duplicate_lines) }}
<span class="pure-form-message-inline">Remove duplicate lines of text</span>
</fieldset>
<fieldset class="pure-control-group">
{{ render_checkbox_field(form.sort_text_alphabetically) }}
<span class="pure-form-message-inline">Helps reduce changes detected caused by sites shuffling lines around, combine with <i>check unique lines</i> below.</span>
</fieldset>
<fieldset class="pure-control-group">
{{ render_checkbox_field(form.check_unique_lines) }}
<span class="pure-form-message-inline">Good for websites that just move the content around, and you want to know when NEW content is added, compares new lines against all history for this watch.</span>
{{ render_checkbox_field(form.trim_text_whitespace) }}
<span class="pure-form-message-inline">Remove any whitespace before and after each line of text</span>
</fieldset>
<fieldset>
<div class="pure-control-group">
@@ -403,7 +418,19 @@ Unavailable") }}
</fieldset>
</div>
</div>
{% endif %}
<div id="text-preview" style="display: none;" >
<script>
const preview_text_edit_filters_url="{{url_for('watch_get_preview_rendered', uuid=uuid)}}";
</script>
<span><strong>Preview of the text that is used for changedetection after all filters run.</strong></span><br>
{#<div id="text-preview-controls"><span id="text-preview-refresh" class="pure-button button-xsmall">Refresh</span></div>#}
<p>
<div id="text-preview-inner"></div>
</p>
</div>
</div>
</div>
{% endif %}
{# rendered sub Template #}
{% if extra_form_content %}
<div class="tab-pane-inner" id="extras_tab">
@@ -479,6 +506,12 @@ Unavailable") }}
</tr>
</tbody>
</table>
{% if watch.history_n %}
<p>
<a href="{{url_for('watch_get_latest_html', uuid=uuid)}}" class="pure-button button-small">Download latest HTML snapshot</a>
</p>
{% endif %}
</div>
</div>
<div id="actions">

View File

@@ -76,7 +76,7 @@
</div>
<div class="pure-control-group">
{{ render_checkbox_field(form.application.form.empty_pages_are_a_change) }}
<span class="pure-form-message-inline">When a page contains HTML, but no renderable text appears (empty page), is this considered a change?</span>
<span class="pure-form-message-inline">When a request returns no content, or the HTML does not contain any text, is this considered a change?</span>
</div>
{% if form.requests.proxy %}
<div class="pure-control-group inline-radio">
@@ -92,7 +92,7 @@
<div class="tab-pane-inner" id="notifications">
<fieldset>
<div class="field-group">
{{ render_common_settings_form(form.application.form, emailprefix, settings_application) }}
{{ render_common_settings_form(form.application.form, emailprefix, settings_application, extra_notification_token_placeholder_info) }}
</div>
</fieldset>
<div class="pure-control-group" id="notification-base-url">
@@ -155,11 +155,13 @@
{{ render_field(form.application.form.global_subtractive_selectors, rows=5, placeholder="header
footer
nav
.stockticker") }}
.stockticker
//*[contains(text(), 'Advertisement')]") }}
<span class="pure-form-message-inline">
<ul>
<li> Remove HTML element(s) by CSS selector before text conversion. </li>
<li> Add multiple elements or CSS selectors per line to ignore multiple parts of the HTML. </li>
<li> Remove HTML element(s) by CSS and XPath selectors before text conversion. </li>
<li> Don't paste HTML here, use only CSS and XPath selectors </li>
<li> Add multiple elements, CSS or XPath selectors per line to ignore multiple parts of the HTML. </li>
</ul>
</span>
</fieldset>

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import resource
import time
from threading import Thread

View File

@@ -1,4 +1,4 @@
# !/usr/bin/python3
#!/usr/bin/env python3
import os
from flask import url_for

View File

@@ -1,3 +1,3 @@
#!/usr/bin/python3
#!/usr/bin/env python3
from .. import conftest

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import time
from flask import url_for

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
from .. import conftest

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import os
from flask import url_for

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import time
from flask import url_for

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import time
from flask import url_for

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import time
from flask import url_for

View File

@@ -1,12 +1,27 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import os
import time
from flask import url_for
from changedetectionio.tests.util import live_server_setup, wait_for_all_checks
def set_response():
import time
data = f"""<html>
<body>
<h1>Awesome, you made it</h1>
yeah the socks request worked
</body>
</html>
"""
with open("test-datastore/endpoint-content.txt", "w") as f:
f.write(data)
time.sleep(1)
def test_socks5(client, live_server, measure_memory_usage):
live_server_setup(live_server)
set_response()
# Setup a proxy
res = client.post(
@@ -24,7 +39,10 @@ def test_socks5(client, live_server, measure_memory_usage):
assert b"Settings updated." in res.data
test_url = "https://changedetection.io/CHANGELOG.txt?socks-test-tag=" + os.getenv('SOCKSTEST', '')
# Because the socks server should connect back to us
test_url = url_for('test_endpoint', _external=True) + f"?socks-test-tag={os.getenv('SOCKSTEST', '')}"
test_url = test_url.replace('localhost.localdomain', 'cdio')
test_url = test_url.replace('localhost', 'cdio')
res = client.post(
url_for("form_quick_watch_add"),
@@ -60,4 +78,4 @@ def test_socks5(client, live_server, measure_memory_usage):
)
# Should see the proper string
assert "+0200:".encode('utf-8') in res.data
assert "Awesome, you made it".encode('utf-8') in res.data

View File

@@ -1,16 +1,32 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import os
import time
from flask import url_for
from changedetectionio.tests.util import live_server_setup, wait_for_all_checks
def set_response():
import time
data = f"""<html>
<body>
<h1>Awesome, you made it</h1>
yeah the socks request worked
</body>
</html>
"""
with open("test-datastore/endpoint-content.txt", "w") as f:
f.write(data)
time.sleep(1)
# should be proxies.json mounted from run_proxy_tests.sh already
# -v `pwd`/tests/proxy_socks5/proxies.json-example:/app/changedetectionio/test-datastore/proxies.json
def test_socks5_from_proxiesjson_file(client, live_server, measure_memory_usage):
live_server_setup(live_server)
test_url = "https://changedetection.io/CHANGELOG.txt?socks-test-tag=" + os.getenv('SOCKSTEST', '')
set_response()
# Because the socks server should connect back to us
test_url = url_for('test_endpoint', _external=True) + f"?socks-test-tag={os.getenv('SOCKSTEST', '')}"
test_url = test_url.replace('localhost.localdomain', 'cdio')
test_url = test_url.replace('localhost', 'cdio')
res = client.get(url_for("settings_page"))
assert b'name="requests-proxy" type="radio" value="socks5proxy"' in res.data
@@ -49,4 +65,4 @@ def test_socks5_from_proxiesjson_file(client, live_server, measure_memory_usage)
)
# Should see the proper string
assert "+0200:".encode('utf-8') in res.data
assert "Awesome, you made it".encode('utf-8') in res.data

View File

@@ -1,3 +1,3 @@
#!/usr/bin/python3
#!/usr/bin/env python3
from .. import conftest

View File

@@ -1,8 +1,8 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import os
import time
from flask import url_for
from ..util import live_server_setup, wait_for_all_checks, extract_UUID_from_client
from ..util import live_server_setup, wait_for_all_checks, extract_UUID_from_client, wait_for_notification_endpoint_output
from changedetectionio.notification import (
default_notification_body,
default_notification_format,
@@ -94,7 +94,7 @@ def test_restock_detection(client, live_server, measure_memory_usage):
assert b'not-in-stock' not in res.data
# We should have a notification
time.sleep(2)
wait_for_notification_endpoint_output()
assert os.path.isfile("test-datastore/notification.txt"), "Notification received"
os.unlink("test-datastore/notification.txt")
@@ -103,6 +103,7 @@ def test_restock_detection(client, live_server, measure_memory_usage):
set_original_response()
client.get(url_for("form_watch_checknow"), follow_redirects=True)
wait_for_all_checks(client)
time.sleep(5)
assert not os.path.isfile("test-datastore/notification.txt"), "No notification should have fired when it went OUT OF STOCK by default"
# BUT we should see that it correctly shows "not in stock"

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import asyncio
from aiosmtpd.controller import Controller
from aiosmtpd.smtp import SMTP

View File

@@ -1,8 +1,8 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import os.path
import time
from flask import url_for
from .util import live_server_setup, wait_for_all_checks
from .util import live_server_setup, wait_for_all_checks, wait_for_notification_endpoint_output
from changedetectionio import html_tools
@@ -112,7 +112,7 @@ def test_check_add_line_contains_trigger(client, live_server, measure_memory_usa
res = client.post(
url_for("settings_page"),
data={"application-notification_title": "New ChangeDetection.io Notification - {{ watch_url }}",
"application-notification_body": 'triggered text was -{{triggered_text}}-',
"application-notification_body": 'triggered text was -{{triggered_text}}- 网站监测 内容更新了',
# https://github.com/caronc/apprise/wiki/Notify_Custom_JSON#get-parameter-manipulation
"application-notification_urls": test_notification_url,
"application-minutes_between_check": 180,
@@ -165,11 +165,12 @@ def test_check_add_line_contains_trigger(client, live_server, measure_memory_usa
assert b'unviewed' in res.data
# Takes a moment for apprise to fire
time.sleep(3)
wait_for_notification_endpoint_output()
assert os.path.isfile("test-datastore/notification.txt"), "Notification fired because I can see the output file"
with open("test-datastore/notification.txt", 'r') as f:
response= f.read()
assert '-Oh yes please-' in response
with open("test-datastore/notification.txt", 'rb') as f:
response = f.read()
assert b'-Oh yes please-' in response
assert '网站监测 内容更新了'.encode('utf-8') in response
res = client.get(url_for("form_delete", uuid="all"), follow_redirects=True)

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import time
from flask import url_for

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import time
from flask import url_for

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import time
from flask import url_for

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import time
from flask import url_for
@@ -69,6 +69,12 @@ def test_check_basic_change_detection_functionality(client, live_server, measure
wait_for_all_checks(client)
uuid = extract_UUID_from_client(client)
# Check the 'get latest snapshot works'
res = client.get(url_for("watch_get_latest_html", uuid=uuid))
assert b'which has this one new line' in res.data
# Now something should be ready, indicated by having a 'unviewed' class
res = client.get(url_for("index"))
assert b'unviewed' in res.data
@@ -86,7 +92,7 @@ def test_check_basic_change_detection_functionality(client, live_server, measure
assert expected_url.encode('utf-8') in res.data
# Following the 'diff' link, it should no longer display as 'unviewed' even after we recheck it a few times
res = client.get(url_for("diff_history_page", uuid="first"))
res = client.get(url_for("diff_history_page", uuid=uuid))
assert b'selected=""' in res.data, "Confirm diff history page loaded"
# Check the [preview] pulls the right one
@@ -143,7 +149,6 @@ def test_check_basic_change_detection_functionality(client, live_server, measure
assert b'unviewed' not in res.data
# #2458 "clear history" should make the Watch object update its status correctly when the first snapshot lands again
uuid = extract_UUID_from_client(client)
client.get(url_for("clear_watch_history", uuid=uuid))
client.get(url_for("form_watch_checknow"), follow_redirects=True)
wait_for_all_checks(client)

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
from .util import set_original_response, live_server_setup, wait_for_all_checks
from flask import url_for

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import time
from flask import url_for

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import time
from flask import url_for

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import time
from flask import url_for

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import time
@@ -87,6 +87,9 @@ def test_element_removal_output():
Some initial text<br>
<p>across multiple lines</p>
<div id="changetext">Some text that changes</div>
<div>Some text should be matched by xPath // selector</div>
<div>Some text should be matched by xPath selector</div>
<div>Some text should be matched by xPath1 selector</div>
</body>
<footer>
<p>Footer</p>
@@ -94,7 +97,16 @@ def test_element_removal_output():
</html>
"""
html_blob = element_removal(
["header", "footer", "nav", "#changetext"], html_content=content
[
"header",
"footer",
"nav",
"#changetext",
"//*[contains(text(), 'xPath // selector')]",
"xpath://*[contains(text(), 'xPath selector')]",
"xpath1://*[contains(text(), 'xPath1 selector')]"
],
html_content=content
)
text = get_text(html_blob)
assert (

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
# coding=utf-8
import time

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import time

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import time
from flask import url_for

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import time
from flask import url_for

View File

@@ -1,10 +1,10 @@
#!/usr/bin/python3
#!/usr/bin/env python3
# https://www.reddit.com/r/selfhosted/comments/wa89kp/comment/ii3a4g7/?context=3
import os
import time
from flask import url_for
from .util import set_original_response, live_server_setup
from .util import set_original_response, live_server_setup, wait_for_notification_endpoint_output
from changedetectionio.model import App
@@ -102,14 +102,15 @@ def test_filter_doesnt_exist_then_exists_should_get_notification(client, live_se
follow_redirects=True
)
assert b"Updated watch." in res.data
time.sleep(3)
wait_for_notification_endpoint_output()
# Shouldn't exist, shouldn't have fired
assert not os.path.isfile("test-datastore/notification.txt")
# Now the filter should exist
set_response_with_filter()
client.get(url_for("form_watch_checknow"), follow_redirects=True)
time.sleep(3)
wait_for_notification_endpoint_output()
assert os.path.isfile("test-datastore/notification.txt")

View File

@@ -1,7 +1,9 @@
import os
import time
from loguru import logger
from flask import url_for
from .util import set_original_response, live_server_setup, extract_UUID_from_client, wait_for_all_checks
from .util import set_original_response, live_server_setup, extract_UUID_from_client, wait_for_all_checks, \
wait_for_notification_endpoint_output
from changedetectionio.model import App
@@ -26,6 +28,12 @@ def run_filter_test(client, live_server, content_filter):
# Response WITHOUT the filter ID element
set_original_response()
# Goto the edit page, add our ignore text
notification_url = url_for('test_notification_endpoint', _external=True).replace('http', 'json')
# Add our URL to the import page
test_url = url_for('test_endpoint', _external=True)
# cleanup for the next
client.get(
url_for("form_delete", uuid="all"),
@@ -34,83 +42,92 @@ def run_filter_test(client, live_server, content_filter):
if os.path.isfile("test-datastore/notification.txt"):
os.unlink("test-datastore/notification.txt")
# Add our URL to the import page
test_url = url_for('test_endpoint', _external=True)
res = client.post(
url_for("form_quick_watch_add"),
data={"url": test_url, "tags": ''},
url_for("import_page"),
data={"urls": test_url},
follow_redirects=True
)
assert b"Watch added" in res.data
# Give the thread time to pick up the first version
assert b"1 Imported" in res.data
wait_for_all_checks(client)
# Goto the edit page, add our ignore text
# Add our URL to the import page
url = url_for('test_notification_endpoint', _external=True)
notification_url = url.replace('http', 'json')
uuid = extract_UUID_from_client(client)
print(">>>> Notification URL: " + notification_url)
assert live_server.app.config['DATASTORE'].data['watching'][uuid]['consecutive_filter_failures'] == 0, "No filter = No filter failure"
# Just a regular notification setting, this will be used by the special 'filter not found' notification
notification_form_data = {"notification_urls": notification_url,
"notification_title": "New ChangeDetection.io Notification - {{watch_url}}",
"notification_body": "BASE URL: {{base_url}}\n"
"Watch URL: {{watch_url}}\n"
"Watch UUID: {{watch_uuid}}\n"
"Watch title: {{watch_title}}\n"
"Watch tag: {{watch_tag}}\n"
"Preview: {{preview_url}}\n"
"Diff URL: {{diff_url}}\n"
"Snapshot: {{current_snapshot}}\n"
"Diff: {{diff}}\n"
"Diff Full: {{diff_full}}\n"
"Diff as Patch: {{diff_patch}}\n"
":-)",
"notification_format": "Text"}
watch_data = {"notification_urls": notification_url,
"notification_title": "New ChangeDetection.io Notification - {{watch_url}}",
"notification_body": "BASE URL: {{base_url}}\n"
"Watch URL: {{watch_url}}\n"
"Watch UUID: {{watch_uuid}}\n"
"Watch title: {{watch_title}}\n"
"Watch tag: {{watch_tag}}\n"
"Preview: {{preview_url}}\n"
"Diff URL: {{diff_url}}\n"
"Snapshot: {{current_snapshot}}\n"
"Diff: {{diff}}\n"
"Diff Full: {{diff_full}}\n"
"Diff as Patch: {{diff_patch}}\n"
":-)",
"notification_format": "Text",
"fetch_backend": "html_requests",
"filter_failure_notification_send": 'y',
"headers": "",
"tags": "my tag",
"title": "my title 123",
"time_between_check-hours": 5, # So that the queue runner doesnt also put it in
"url": test_url,
}
notification_form_data.update({
"url": test_url,
"tags": "my tag",
"title": "my title 123",
"headers": "",
"filter_failure_notification_send": 'y',
"include_filters": content_filter,
"fetch_backend": "html_requests"})
# A POST here will also reset the filter failure counter (filter_failure_notification_threshold_attempts)
res = client.post(
url_for("edit_page", uuid="first"),
data=notification_form_data,
url_for("edit_page", uuid=uuid),
data=watch_data,
follow_redirects=True
)
assert b"Updated watch." in res.data
wait_for_all_checks(client)
assert live_server.app.config['DATASTORE'].data['watching'][uuid]['consecutive_filter_failures'] == 0, "No filter = No filter failure"
# Now the notification should not exist, because we didnt reach the threshold
# Now add a filter, because recheck hours == 5, ONLY pressing of the [edit] or [recheck all] should trigger
watch_data['include_filters'] = content_filter
res = client.post(
url_for("edit_page", uuid=uuid),
data=watch_data,
follow_redirects=True
)
assert b"Updated watch." in res.data
# It should have checked once so far and given this error (because we hit SAVE)
wait_for_all_checks(client)
assert not os.path.isfile("test-datastore/notification.txt")
# Hitting [save] would have triggered a recheck, and we have a filter, so this would be ONE failure
assert live_server.app.config['DATASTORE'].data['watching'][uuid]['consecutive_filter_failures'] == 1, "Should have been checked once"
# recheck it up to just before the threshold, including the fact that in the previous POST it would have rechecked (and incremented)
for i in range(0, App._FILTER_FAILURE_THRESHOLD_ATTEMPTS_DEFAULT-2):
# Add 4 more checks
checked = 0
ATTEMPT_THRESHOLD_SETTING = live_server.app.config['DATASTORE'].data['settings']['application'].get('filter_failure_notification_threshold_attempts', 0)
for i in range(0, ATTEMPT_THRESHOLD_SETTING - 2):
checked += 1
client.get(url_for("form_watch_checknow"), follow_redirects=True)
wait_for_all_checks(client)
time.sleep(2) # delay for apprise to fire
assert not os.path.isfile("test-datastore/notification.txt"), f"test-datastore/notification.txt should not exist - Attempt {i} when threshold is {App._FILTER_FAILURE_THRESHOLD_ATTEMPTS_DEFAULT}"
# We should see something in the frontend
res = client.get(url_for("index"))
assert b'Warning, no filters were found' in res.data
res = client.get(url_for("index"))
assert b'Warning, no filters were found' in res.data
assert not os.path.isfile("test-datastore/notification.txt")
time.sleep(1)
assert live_server.app.config['DATASTORE'].data['watching'][uuid]['consecutive_filter_failures'] == 5
time.sleep(2)
# One more check should trigger the _FILTER_FAILURE_THRESHOLD_ATTEMPTS_DEFAULT threshold
client.get(url_for("form_watch_checknow"), follow_redirects=True)
wait_for_all_checks(client)
time.sleep(2) # delay for apprise to fire
wait_for_notification_endpoint_output()
# Now it should exist and contain our "filter not found" alert
assert os.path.isfile("test-datastore/notification.txt")
with open("test-datastore/notification.txt", 'r') as f:
notification = f.read()
@@ -123,10 +140,11 @@ def run_filter_test(client, live_server, content_filter):
set_response_with_filter()
# Try several times, it should NOT have 'filter not found'
for i in range(0, App._FILTER_FAILURE_THRESHOLD_ATTEMPTS_DEFAULT):
for i in range(0, ATTEMPT_THRESHOLD_SETTING + 2):
client.get(url_for("form_watch_checknow"), follow_redirects=True)
wait_for_all_checks(client)
wait_for_notification_endpoint_output()
# It should have sent a notification, but..
assert os.path.isfile("test-datastore/notification.txt")
# but it should not contain the info about a failed filter (because there was none in this case)
@@ -135,9 +153,6 @@ def run_filter_test(client, live_server, content_filter):
assert not 'CSS/xPath filter was not present in the page' in notification
# Re #1247 - All tokens got replaced correctly in the notification
res = client.get(url_for("index"))
uuid = extract_UUID_from_client(client)
# UUID is correct, but notification contains tag uuid as UUIID wtf
assert uuid in notification
# cleanup for the next
@@ -152,9 +167,11 @@ def test_setup(live_server):
live_server_setup(live_server)
def test_check_include_filters_failure_notification(client, live_server, measure_memory_usage):
# live_server_setup(live_server)
run_filter_test(client, live_server,'#nope-doesnt-exist')
def test_check_xpath_filter_failure_notification(client, live_server, measure_memory_usage):
# live_server_setup(live_server)
run_filter_test(client, live_server, '//*[@id="nope-doesnt-exist"]')
# Test that notification is never sent

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import time
from flask import url_for

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import time
import os

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
"""Test suite for the method to extract text from an html string"""
from ..html_tools import html_to_text

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
from . util import live_server_setup
from changedetectionio import html_tools

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import time
from flask import url_for

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import time
from flask import url_for

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
"""Test suite for the render/not render anchor tag content functionality"""
import time

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import time
from flask import url_for

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import time
from flask import url_for

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import io
import os
import time

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import time
from flask import url_for

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
# coding=utf-8
import time

View File

@@ -1,11 +1,8 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import time
from flask import url_for
from urllib.request import urlopen
from .util import set_original_response, set_modified_response, live_server_setup
sleep_time_for_fetch_thread = 3
from .util import set_original_response, set_modified_response, live_server_setup, wait_for_all_checks
import time
def set_nonrenderable_response():
@@ -16,12 +13,18 @@ def set_nonrenderable_response():
</body>
</html>
"""
with open("test-datastore/endpoint-content.txt", "w") as f:
f.write(test_return_data)
time.sleep(1)
return None
def set_zero_byte_response():
with open("test-datastore/endpoint-content.txt", "w") as f:
f.write("")
time.sleep(1)
return None
def test_check_basic_change_detection_functionality(client, live_server, measure_memory_usage):
set_original_response()
live_server_setup(live_server)
@@ -35,18 +38,11 @@ def test_check_basic_change_detection_functionality(client, live_server, measure
assert b"1 Imported" in res.data
time.sleep(sleep_time_for_fetch_thread)
wait_for_all_checks(client)
# Do this a few times.. ensures we dont accidently set the status
for n in range(3):
client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread)
# It should report nothing found (no new 'unviewed' class)
res = client.get(url_for("index"))
assert b'unviewed' not in res.data
# It should report nothing found (no new 'unviewed' class)
res = client.get(url_for("index"))
assert b'unviewed' not in res.data
#####################
@@ -64,7 +60,7 @@ def test_check_basic_change_detection_functionality(client, live_server, measure
client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread)
wait_for_all_checks(client)
# It should report nothing found (no new 'unviewed' class)
res = client.get(url_for("index"))
@@ -86,14 +82,20 @@ def test_check_basic_change_detection_functionality(client, live_server, measure
client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up
time.sleep(sleep_time_for_fetch_thread)
wait_for_all_checks(client)
# It should report nothing found (no new 'unviewed' class)
res = client.get(url_for("index"))
assert b'unviewed' in res.data
client.get(url_for("mark_all_viewed"), follow_redirects=True)
# A totally zero byte (#2528) response should also not trigger an error
set_zero_byte_response()
client.get(url_for("form_watch_checknow"), follow_redirects=True)
wait_for_all_checks(client)
res = client.get(url_for("index"))
assert b'unviewed' in res.data # A change should have registered because empty_pages_are_a_change is ON
assert b'fetch-error' not in res.data
#
# Cleanup everything

View File

@@ -3,6 +3,8 @@ import os
import time
import re
from flask import url_for
from loguru import logger
from .util import set_original_response, set_modified_response, set_more_modified_response, live_server_setup, wait_for_all_checks, \
set_longer_modified_response
from . util import extract_UUID_from_client
@@ -289,11 +291,11 @@ def test_notification_custom_endpoint_and_jinja2(client, live_server, measure_me
data={
"application-fetch_backend": "html_requests",
"application-minutes_between_check": 180,
"application-notification_body": '{ "url" : "{{ watch_url }}", "secret": 444 }',
"application-notification_body": '{ "url" : "{{ watch_url }}", "secret": 444, "somebug": "网站监测 内容更新了" }',
"application-notification_format": default_notification_format,
"application-notification_urls": test_notification_url,
# https://github.com/caronc/apprise/wiki/Notify_Custom_JSON#get-parameter-manipulation
"application-notification_title": "New ChangeDetection.io Notification - {{ watch_url }}",
"application-notification_title": "New ChangeDetection.io Notification - {{ watch_url }} ",
},
follow_redirects=True
)
@@ -322,6 +324,7 @@ def test_notification_custom_endpoint_and_jinja2(client, live_server, measure_me
j = json.loads(x)
assert j['url'].startswith('http://localhost')
assert j['secret'] == 444
assert j['somebug'] == '网站监测 内容更新了'
# URL check, this will always be converted to lowercase
assert os.path.isfile("test-datastore/notification-url.txt")
@@ -347,3 +350,82 @@ def test_notification_custom_endpoint_and_jinja2(client, live_server, measure_me
url_for("form_delete", uuid="all"),
follow_redirects=True
)
#2510
def test_global_send_test_notification(client, live_server, measure_memory_usage):
#live_server_setup(live_server)
set_original_response()
if os.path.isfile("test-datastore/notification.txt"):
os.unlink("test-datastore/notification.txt")
# otherwise other settings would have already existed from previous tests in this file
res = client.post(
url_for("settings_page"),
data={
"application-fetch_backend": "html_requests",
"application-minutes_between_check": 180,
#1995 UTF-8 content should be encoded
"application-notification_body": 'change detection is cool 网站监测 内容更新了',
"application-notification_format": default_notification_format,
"application-notification_urls": "",
"application-notification_title": "New ChangeDetection.io Notification - {{ watch_url }}",
},
follow_redirects=True
)
assert b'Settings updated' in res.data
test_url = url_for('test_endpoint', _external=True)
res = client.post(
url_for("form_quick_watch_add"),
data={"url": test_url, "tags": 'nice one'},
follow_redirects=True
)
assert b"Watch added" in res.data
test_notification_url = url_for('test_notification_endpoint', _external=True).replace('http://', 'post://')+"?xxx={{ watch_url }}&+custom-header=123"
######### Test global/system settings
res = client.post(
url_for("ajax_callback_send_notification_test")+"?mode=global-settings",
data={"notification_urls": test_notification_url},
follow_redirects=True
)
assert res.status_code != 400
assert res.status_code != 500
# Give apprise time to fire
time.sleep(4)
with open("test-datastore/notification.txt", 'r') as f:
x = f.read()
assert 'change detection is cool 网站监测 内容更新了' in x
os.unlink("test-datastore/notification.txt")
######### Test group/tag settings
res = client.post(
url_for("ajax_callback_send_notification_test")+"?mode=group-settings",
data={"notification_urls": test_notification_url},
follow_redirects=True
)
assert res.status_code != 400
assert res.status_code != 500
# Give apprise time to fire
time.sleep(4)
with open("test-datastore/notification.txt", 'r') as f:
x = f.read()
# Should come from notification.py default handler when there is no notification body to pull from
assert 'change detection is cool 网站监测 内容更新了' in x
client.get(
url_for("form_delete", uuid="all"),
follow_redirects=True
)

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import time
from flask import url_for

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import time
from flask import url_for

View File

@@ -1,8 +1,10 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import os
import time
from flask import url_for
from .util import live_server_setup, wait_for_all_checks, extract_UUID_from_client
from .util import live_server_setup, wait_for_all_checks, extract_UUID_from_client, wait_for_notification_endpoint_output
from ..notification import default_notification_format
instock_props = [
# LD+JSON with non-standard list of 'type' https://github.com/dgtlmoon/changedetection.io/issues/1833
@@ -144,14 +146,13 @@ def _run_test_minmax_limit(client, extra_watch_edit_form):
data={"url": test_url, "tags": 'restock tests', 'processor': 'restock_diff'},
follow_redirects=True
)
# A change in price, should trigger a change by default
wait_for_all_checks(client)
data = {
"tags": "",
"url": test_url,
"headers": "",
"time_between_check-hours": 5,
'fetch_backend': "html_requests"
}
data.update(extra_watch_edit_form)
@@ -176,11 +177,9 @@ def _run_test_minmax_limit(client, extra_watch_edit_form):
assert b'1,000.45' or b'1000.45' in res.data #depending on locale
assert b'unviewed' not in res.data
# price changed to something LESS than min (900), SHOULD be a change
set_original_response(props_markup=instock_props[0], price='890.45')
# let previous runs wait
time.sleep(1)
res = client.get(url_for("form_watch_checknow"), follow_redirects=True)
assert b'1 watches queued for rechecking.' in res.data
wait_for_all_checks(client)
@@ -195,7 +194,8 @@ def _run_test_minmax_limit(client, extra_watch_edit_form):
client.get(url_for("form_watch_checknow"), follow_redirects=True)
wait_for_all_checks(client)
res = client.get(url_for("index"))
assert b'1,890.45' or b'1890.45' in res.data
# Depending on the LOCALE it may be either of these (generally for US/default/etc)
assert b'1,890.45' in res.data or b'1890.45' in res.data
assert b'unviewed' in res.data
res = client.get(url_for("form_delete", uuid="all"), follow_redirects=True)
@@ -305,6 +305,70 @@ def test_itemprop_percent_threshold(client, live_server):
res = client.get(url_for("form_delete", uuid="all"), follow_redirects=True)
assert b'Deleted' in res.data
def test_change_with_notification_values(client, live_server):
#live_server_setup(live_server)
if os.path.isfile("test-datastore/notification.txt"):
os.unlink("test-datastore/notification.txt")
test_url = url_for('test_endpoint', _external=True)
set_original_response(props_markup=instock_props[0], price='960.45')
notification_url = url_for('test_notification_endpoint', _external=True).replace('http', 'json')
######################
# You must add a type of 'restock_diff' for its tokens to register as valid in the global settings
client.post(
url_for("form_quick_watch_add"),
data={"url": test_url, "tags": 'restock tests', 'processor': 'restock_diff'},
follow_redirects=True
)
# A change in price, should trigger a change by default
wait_for_all_checks(client)
# Should see new tokens register
res = client.get(url_for("settings_page"))
assert b'{{restock.original_price}}' in res.data
assert b'Original price at first check' in res.data
#####################
# Set this up for when we remove the notification from the watch, it should fallback with these details
res = client.post(
url_for("settings_page"),
data={"application-notification_urls": notification_url,
"application-notification_title": "title new price {{restock.price}}",
"application-notification_body": "new price {{restock.price}}",
"application-notification_format": default_notification_format,
"requests-time_between_check-minutes": 180,
'application-fetch_backend': "html_requests"},
follow_redirects=True
)
# check tag accepts without error
# Check the watches in these modes add the tokens for validating
assert b"A variable or function is not defined" not in res.data
assert b"Settings updated." in res.data
set_original_response(props_markup=instock_props[0], price='960.45')
# A change in price, should trigger a change by default
set_original_response(props_markup=instock_props[0], price='1950.45')
client.get(url_for("form_watch_checknow"))
wait_for_all_checks(client)
wait_for_notification_endpoint_output()
assert os.path.isfile("test-datastore/notification.txt"), "Notification received"
with open("test-datastore/notification.txt", 'r') as f:
notification = f.read()
assert "new price 1950.45" in notification
assert "title new price 1950.45" in notification
def test_data_sanity(client, live_server):
#live_server_setup(live_server)

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import time
from flask import url_for

View File

@@ -1,7 +1,12 @@
import os
from flask import url_for
from .util import set_original_response, set_modified_response, live_server_setup, wait_for_all_checks
import time
from .. import strtobool
def test_setup(client, live_server, measure_memory_usage):
live_server_setup(live_server)
@@ -55,17 +60,33 @@ def test_bad_access(client, live_server, measure_memory_usage):
assert b'Watch protocol is not permitted by SAFE_PROTOCOL_REGEX' in res.data
# file:// is permitted by default, but it will be caught by ALLOW_FILE_URI
def test_file_access(client, live_server, measure_memory_usage):
#live_server_setup(live_server)
test_file_path = "/tmp/test-file.txt"
# file:// is permitted by default, but it will be caught by ALLOW_FILE_URI
client.post(
url_for("form_quick_watch_add"),
data={"url": 'file:///tasty/disk/drive', "tags": ''},
data={"url": f"file://{test_file_path}", "tags": ''},
follow_redirects=True
)
wait_for_all_checks(client)
res = client.get(url_for("index"))
assert b'file:// type access is denied for security reasons.' in res.data
# If it is enabled at test time
if strtobool(os.getenv('ALLOW_FILE_URI', 'false')):
res = client.get(
url_for("preview_page", uuid="first"),
follow_redirects=True
)
# Should see something (this file added by run_basic_tests.sh)
assert b"Hello world" in res.data
else:
# Default should be here
assert b'file:// type access is denied for security reasons.' in res.data
def test_xss(client, live_server, measure_memory_usage):
#live_server_setup(live_server)

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import time
from flask import url_for

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import time
from flask import url_for

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import time
from flask import url_for

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import time
from flask import url_for

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import time
from flask import url_for

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
import time
from flask import url_for
@@ -11,6 +11,8 @@ def set_original_ignore_response():
<p>Some initial text</p>
<p>Which is across multiple lines</p>
<p>So let's see what happens.</p>
<p>&nbsp; So let's see what happens. <br> </p>
<p>A - sortable line</p>
</body>
</html>
"""
@@ -164,5 +166,52 @@ def test_sort_lines_functionality(client, live_server, measure_memory_usage):
assert res.data.find(b'A uppercase') < res.data.find(b'Z last')
assert res.data.find(b'Some initial text') < res.data.find(b'Which is across multiple lines')
res = client.get(url_for("form_delete", uuid="all"), follow_redirects=True)
assert b'Deleted' in res.data
def test_extra_filters(client, live_server, measure_memory_usage):
#live_server_setup(live_server)
set_original_ignore_response()
# Add our URL to the import page
test_url = url_for('test_endpoint', _external=True)
res = client.post(
url_for("import_page"),
data={"urls": test_url},
follow_redirects=True
)
assert b"1 Imported" in res.data
wait_for_all_checks(client)
# Add our URL to the import page
res = client.post(
url_for("edit_page", uuid="first"),
data={"remove_duplicate_lines": "y",
"trim_text_whitespace": "y",
"sort_text_alphabetically": "", # leave this OFF for testing
"url": test_url,
"fetch_backend": "html_requests"},
follow_redirects=True
)
assert b"Updated watch." in res.data
# Give the thread time to pick it up
wait_for_all_checks(client)
# Trigger a check
client.get(url_for("form_watch_checknow"), follow_redirects=True)
# Give the thread time to pick it up
wait_for_all_checks(client)
res = client.get(
url_for("preview_page", uuid="first")
)
assert res.data.count(b"see what happens.") == 1
# still should remain unsorted ('A - sortable line') stays at the end
assert res.data.find(b'A - sortable line') > res.data.find(b'Which is across multiple lines')
res = client.get(url_for("form_delete", uuid="all"), follow_redirects=True)
assert b'Deleted' in res.data

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
# run from dir above changedetectionio/ dir
# python3 -m unittest changedetectionio.tests.unit.test_jinja2_security

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
# run from dir above changedetectionio/ dir
# python3 -m unittest changedetectionio.tests.unit.test_notification_diff

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
# run from dir above changedetectionio/ dir
# python3 -m unittest changedetectionio.tests.unit.test_restock_logic

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
# run from dir above changedetectionio/ dir
# python3 -m unittest changedetectionio.tests.unit.test_notification_diff

View File

@@ -1,4 +1,4 @@
#!/usr/bin/python3
#!/usr/bin/env python3
from flask import make_response, request
from flask import url_for
@@ -76,6 +76,18 @@ def set_more_modified_response():
return None
def wait_for_notification_endpoint_output():
'''Apprise can take a few seconds to fire'''
#@todo - could check the apprise object directly instead of looking for this file
from os.path import isfile
for i in range(1, 20):
time.sleep(1)
if isfile("test-datastore/notification.txt"):
return True
return False
# kinda funky, but works for now
def extract_api_key_from_UI(client):
import re

Some files were not shown because too many files have changed in this diff Show More