Notable changes:
- UI - show queued job and manage preferred queue using drag and drop #1476
- Stability improvements - recover from redis, etcd, mongodb disconnections and crashes #1433
Fixed bugs:
- trigger-service ETCD watch disconnected after ETCD failover #1558
- building algorithm - an algorithm version was not created after a successful build #1555
- pipeline driver queue - recovery when all jobs are in Preferred queue #1547
- debug: debugged algorithm error is not reported #1474
- possible inconsistency upon algorithm queue crashes #1382
- java code api does not work #1358
- Ingress does not work in kubernetes 1.22 #1354
- Swagger api /pipeline/results and /pipeline/status limit their responses to 1000 jobs #1190
- UI - Gateway node should have default "stateType": "stateful" #1562
- UI - datasource copy button version/snapshot generate wrong json #1404
- Recover from Redis disconnection #1433
Closed issues:
- User ordering for pending jobs - pipeline-driver-queue #1449
- Make algorithm loading errors visible in dashboard #1168
- Add storageClass placeholders in helm chart #1557
- Add active dashboard #1512
- Add specific TTL for active pipelines #1485
Merged pull requests:
- log by last task details #1569 (golanha)
- validate gateway node type if set or set it to stateful #1563 (yehiyam)
- add force flag to kaniko to fix cgroups v2 issue #1561 (yehiyam)
- exit on etcd watch error #1560 (yehiyam)
- support filtering managed with no tag #1554 (golanha)
- add queue count #1553 (golanha)
- add pipeline name to exec/jobs #1552 (yehiyam)
- reduce memory in managed insertion #1551 (yehiyam)
- fix job invalid status #1550 (yehiyam)
- Pdq-fixes #1549 (yehiyam)
- fix status if not set correctly by pipeline driver #1548 (yehiyam)
- fix consumer initialization order #1546 (yehiyam)
- order bootstrap init #1545 (golanha)
- add pipeline name to discovery #1544 (yehiyam)
- concurrency updates #1538 (yehiyam)
- restart on error #1537 (golanha)
- add exec jobs api #1528 (yehiyam)
Notable changes:
- Add output node type. Allows collecting output from multiple nodes in the pipeline
- Add ability to prioritize queued jobs by jobIds, pipeline name and tags #1449
- Usability updates to UI
Merged pull requests:
- update etcd package to prevent watch and get race condition #1535 (yehiyam)
- handle pipeline-driver scale #1530 (yehiyam)
- upgrade dvc #1525 (yehiyam)
- fix python build on slim #1523 (yehiyam)
- paging to the end of the list, and returning count next prev #1519 (golanha)
- count instead of jobs list and expose filter by pipeline and tag #1513 (golanha)
- add counter metrics #1517 (yehiyam)
- add buildConfig resources #1516 (yehiyam)
- add active ttl #1486 (yehiyam)
- allow self signed s3 #1509 (golanha)
- update nodejs wrapper to 2.0.61 #1508 (hkube-ci)
- remove get dependency python and nodejs #1396 (golanha)
- Job aggregation api #1493 (golanha)
- update java wrapper to v2.1.10-SNAPSHOT #1502 (hkube-ci)
- update java wrapper to v2.1.9-SNAPSHOT #1501 (hkube-ci)
- use getAll on clean etcd #1457 (golanha)
- fix loading order #1487 (golanha)
- job priority #1466 (golanha)
- prevent error message if unable to find snapshot #1480 (yehiyam)
- handle error from debugged algorithm #1478 (yehiyam)
- update java wrapper to v2.1.8-SNAPSHOT #1479 (hkube-ci)
- Optimizer #1430 (golanha)
- flow-input #1419 (nassiharel)
- fix: webhooks levels #1464 (nassiharel)
- disable datasources mount if service is disabled #1456 (yehiyam)
- apply resources to pipeline driver #1458 (yehiyam)
- fix negative free gpu resources #1441 (yehiyam)
- setting.xml plugin repository points to artifacts-registry #1450 (golanha)
- feat: exit on redis disconnection #1437 (nassiharel)
- feat: improve delete algorithm #1434 (nassiharel)
- algorithm kind #1424 (nassiharel)
- update python wrapper to hkube-python-wrapper==2.3.0.dev1 #1442 (hkube-ci)
- Update-logger-master #1432 (yehiyam)
- pipeline kind #1421 (nassiharel)
Fixed bugs:
- when stopping a running pipeline the job stays active in the pipeline driver #1534
- fail to execute pipeline with output the second time #1531
- pipeline driver does not scale-up #1529
- Algorithm queue - when the algorithm starts for the first time after creation, an error is displayed #1443
- Rebuild failed Error: unable to find build ->... #1412
- Streaming Gateway send message issues #1376
- return no tags in managed tag aggregation #1526 (golanha)
- algorithm operator version #1520 (golanha)
- Fail to build algorithm with "baseImage":"python:3.7-slim" #1522
- when creating a Tensorboard or Optuna board, the board is created with the same path #1518
- fail to execute node with input object that contains flowInput and batch on previous node results #1365
- pipe in pipe requests are blocked by API server #1521
- No resources for buildConfig #1515
- Cannot disable datasource in worker when datasource PVC is not available #1455
- algorithm-operator does not set pipeline-driver resources #1453
- Pipeline wizard - in the JSON, streaming "flows" should be under "streaming" #1397
- if an algorithm is in a terminating state it should be ignored #1348
- pipeline wizard #1336
- DVC client fails to move files #1299
- web browser crashes viewing dashboard after heavy run #1183
- API-server disconnected to the watcher #1129
- streaming: stateful node memory leak when reconnecting to stateless node #1108
- streaming - fails to maintain high rate flow over time #1064
- Streaming stateless NodeJS: pods that were stopped do not restart #1252
- pipeline "progressVerbosityLevel" does not change the webhook verbosity level #1463
- Caching and debug node buttons should be disabled in the streaming pipeline #1335
Closed issues:
- Allow user to specify debug nodes in pipeline execution #1324
- allow multiple pipelines with output #1533 (yehiyam)
- Add specific TTL for active pipelines to v1.3 #1482
- Create output kind #1448
- Job search API #1337
- Algorithm Queue auto-removal #1281
- Add ability to filter out pending jobs (part feature, part bug) #1186
- Pipeline with conditional algorithms (if statements) #1173
- Resource consumption optimizations #966
- algorithm nullable properties #1271
- streaming #585
Implemented enhancements:
- FlowInput Size Issue #1303
v2.1.93 (2021-11-07)
- Gateway API - for streaming, so you will be able to create a REST API to a streaming channel directly from hkube routing #1233
- Debugging improvements - no need to create a debug algorithm. Every algorithm can be debugged by the user by just setting a flag in the pipeline spec design
- update logger #1431
- add throttle to k8s error #1422
- feat: add builds clean #1418
- add extra registries #1364
- feat: add errors log info #1417
- disable enableCheckStalledJobs in worker #1416
- fix algorithm_queue_scale #1415
- feat: fix rebuild issue #1414
- queues snapshot #1387
- Streaming gateway #1413
- algorithm-queue scale improvement #1405
- update nodejs wrapper to 2.0.50 #1409
- Pipeline status bug #1408
- update logger and config for fluentbit #1403
- feat: improve drivers scale #1395
- update core/algorithm-builder/environments/python/wrapper/requirements.txt to hkube-python-wrapper==2.2.0.dev9 #1394
- update core/algorithm-builder/environments/python/wrapper/requirements.txt to hkube-python-wrapper==2.2.0.dev7 #1393
- clean openshift build #1383
- add emptyDir and configMap mounts #1379
- update core/algorithm-builder/environments/java/wrapper/version.txt to v2.1.7-SNAPSHOT #1389
- Support range java #1380
- update parser to handle batch on object with flowInput #1378
- don't retry for algorithm execution code api #1373
- update java wrapper to v1.0.20-SNAPSHOT #1372
- add more logs #1361
- java version in one place #1366
- fix ingress #1363
- fix: reconciler #1360
- feat: add queue active by timestamp #1353
- feat: add search limit #1356
- add ingress type #1355
- 2.0 snapshot #1351
- support log message from debug algorithm #1342
- update python wrapper in builder to version 2.2.0.dev5 #1346
- update nodejs wrapper to 2.0.45 #1345
- debug hkube api #1316
- fix algorithm container status is undefined message #1333
- feat: add rerun #1331
- update python wrapper in builder to version 2.2.0.dev3 #1332
- update units-converter to allow byte values #1330
- handle raw run of cached #1326
- add debugOverride #1321
- add maxWorkers to algorithm spec #1322
- feat: algorithm-queue scaled #1320
- feat: algorithm-queue scale #1318
- Swaggers merge #1310
- trigger consumer job when reconnected #1317
- add pipeline type for caching debug #1313
- exec algorithm as debug #1309
- feat: streaming flow validation #1306
- fix getting algorithm template #1307
- algorithm-debug #1264
- pipeline-driver-queue latency issue #1301
- pin dvc version #1302
- fix empty docker creds on openshift 3.11 #1292
- Rest Refactoring #1291
- feat: improve job stop/pause/result #1290
- Hkube gc #1287
- add bodySizeLimit #1288
- feat: move caching-service to api-server #1284
- feat: ceil drivers amount #1283
- feat: jobs search #1277
- feat: driver tests #1280
- Pipeline drivers scale #1259
- feat: worker bugs #1279
- update nodejs wrapper to 2.0.41 #1278
- feat: check algorithm container status #1275
- feat: resolve createIndices #1273
- feat: resolve edges #1272
- feat: delete algorithm nullable properties #1268
- feat: adjust envs #1262
- feat: add close method #1270
- feat: reservedMemory parse int #1265
- feat: driver_tests #1266
- feat: dataSource -> spec #1261
- add healthchecks #1263
- Algorithm queue balancer #1251
- Streaming gateway #1234
- feat: add labels and annotations #1228
- update kubernetes client package to fix k3s issue #1257
- Concurrent pipelines #1250
- use later nodejswrapper c02659e
- Revert "Support range" 4635b04
- Support range 7907765
v2.0.146 (2021-04-28)
- Streaming pipelines - Add the ability to create pipelines that run continuously and pass the data along the flow #263
- Data resource - This feature will enable managing data sources and applying them to pipelines on top of dvc #965
- Redesign Data transportation - moving data transportation from object storage to ZMQ #771
- fix dependency_install_cmd openshift #1245
- fix: pipeline-driver-queue concurrent pipelines #1244
- remove zip file after download complete #1242
- converted tmp root dir to absolute path #1240
- updated hkube/db version on all services #1237
- Fix error message #1235
- java wrapper version 2.0.00 #1232
- update python wrapper in builder to version 2.1.0 #1230
- fix no git user #1229
- added validation for uploaded meta without data file #1227
- update nodejs wrapper to 2.0.40 #1226
- updated dependencies #1225
- Convert to absolute paths #1224
- update nodejs wrapper to 2.0.39 #1223
- update nodejs wrapper to 2.0.38 #1222
- added dvcignore content on create datasource #1220
- handle no data on webhook #1219
- fix input display in ui for batch results #1217
- Fix running snapshot #1216
- feat: node wrapper version #1213
- datasource-files-path #1214
- Bugfix subdirs #1212
- update python wrapper in builder to version 2.1.0.dev37 #1211
- feat: logger version #1209
- feat: add total queue size #1208
- update-db #1207
- feat: add logger #1205
- use workerImage if available #1135
- feat: change scaler config #1195
- increase list from api to 100000 #1200
- ignore-undefined #1203
- use new graph store #1198
- feat: add reservedMemory #1197
- use reservedMemory for allocation #1196
- update python wrapper in builder to version 2.1.0.dev34 #1194
- support storageInput for code-api #1193
- added support for updating credentials #1192
- update python wrapper in builder to version 2.1.0.dev33 #1189
- Streaming metrics #1187
- fixed using external github repository #1185
- Bugfix snapshot #1184
- fix: issues #1182
- Refactor external datasource #1181
- get results in webhook handler #1175
- update python wrapper in builder to version 2.1.0.dev32 #1180
- fix: enhance scaler #1178
- update python wrapper in builder to version 2.1.0.dev31 #1179
- added config.template #1169
- Auto scaler #1170
- update python wrapper in builder to version 2.1.0.dev30 #1164
- update python wrapper in builder to version 2.1.0.dev28 #1161
- update python wrapper in builder to version 2.1.0.dev27 #1159
- feat: add streaming_roundtrip_metric #1158
- update python wrapper in builder to version 2.1.0.dev26 #1157
- Support internal kind #1148
- fix-settings #1156
- feat: improve scaling metrics #1155
- feat: improve durations scaling #1154
- accurate scale #1145
- update python wrapper in builder to version 2.1.0.dev25 #1153
- feat: DAG #1152
- update python wrapper in builder to version 2.1.0.dev24 #1151
- update python wrapper in builder to version 2.1.0.dev22 #1147
- update python wrapper in builder to version 2.1.0.dev21 #1146
- Datasource failure cleanup #1141
- update python wrapper in builder to version 2.1.0.dev19 #1142
- Streaming auto scaler #1130
- Bugfix datasources #1140
- add datasource envs and mounts to debug worker #1139
- update python wrapper in builder to version 2.1.0.dev18 #1138
- Datasource mounted cleanup #1137
- implemented delete from git and storage #1115
- Support datasource external storage #1134
- add optional api-server prefix #1126
- clean stopped jobs #1131
- Cli endpoint #1121
- Stored pipeline merge #1128
- fix configmap algorithm operator #1127
- throw if no hkube configmap #1123
- update python wrapper in builder to version 2.1.0.dev17 #1117
- fix component #1116
- update python wrapper in builder to version 2.1.0.dev16 #1113
- Datasource #996
- add stop to isDiscovery #1111
- feat: streaming retry #1107
- update python wrapper in builder to version 2.1.0.dev14 #1105
- feat: handle retry on streaming #1102
- execute copy on docker #1103
- use correct project for openshift build #1092
- update python wrapper in builder to version 2.1.0.dev13 #1099
- adding stopping command #1097
- feat: improve worker logs #1090
- fix: improve disallow changing algorithm type #1094
- feat: improve worker logging-proxy #1087
- update python wrapper in builder to version 2.1.0.dev12 #1086
- upgrade nodejs wrapper to 28 #1085
- algorithm-deps-install #1081
- update python wrapper in builder to version 2.1.0.dev11 #1082
- update python wrapper in builder to version 2.1.0.dev10 #1079
- feat: enable keys with dots! #1077
- feat: ignore stateful scale down #1073
- feat: enable big data debug #1071
- fill missing data that was too large for debug algorithm #1069
- feat: fix trigger and boards types #1068
- use settings.xml at build #1057
- pass STORAGE_RESULTS_THRESHOLD to pd #1066
- update python wrapper in builder to version 2.1.0.dev9 #1065
- feat: add info about node input #1060
- feat: fix run node twice #1062
- fix ignoring debug algorithm #1061
- improve scale up/down #1054
- remove m2 dir creation in java builder #1049
- db migration #1039
- ignore .pyc files #1050
- use internal artifactory at java build .... bump version [skip ci] #1040
- use internal artifactory at java build #1040
- update python wrapper in builder to version 2.1.0.dev8 .... bump version [skip ci] #1047
- update python wrapper in builder to version 2.1.0.dev8 #1047
- run npm ci inside docker #1046
- feat: minor bugs #1044
- fix nodejs node_modules #1041
- update python wrapper in builder to version 2.1.0.dev7 .... bump version [skip ci] #1038
- update python wrapper in builder to version 2.1.0.dev7 #1038
- ci-try #1033
- feat: add buildId to version after successful build #1029
- fix: : Cannot set property 'error' of null, remove custom stream child #1028
- update nodejs wrapper to 2.0.25 #1026
- feat: delete all .travis.yml #1027
- add stream flow, improve scale up/down #1023
- Streaming custom flow #1021
- feat: trigger builds #1015
- update python wrapper in builder to version 2.1.0.dev2 #1020
- update python wrapper in builder to version 2.1.0.dev1 #1019
- add support for cyclic graph, streaming custom flow #1010
- V2 streaming #892
- add pullSecret for openshift builds #1005
- update python wrapper in builder to version 2.0.30 #1007
- 2.0-SNAPSHOT #1000
- fix: bug with cache nodes #1003
- fix remove file on apply, fix update algorithm #999
- feat: algorithm guarantee quota #988
- feat: make versions work with id #992
- support java build in disconnected environment #995
- fix wrong validation of algorithmEnv #994
- feat: add pipelineKind schema 562013a
- Github actions CI 57b600b
- feat: update CHANGELOG.md 7f55078
v1.3.147 (2020-10-20)
Fixed bugs:
- Swagger api exec/algorithm does not accept any input that isn't an empty list #985
- Fail to "Run Node" when data received from flowInput #969
- Debug url does not contain ingress prefix #961
- hkubectl Issues #881
- stopping the node that contains the ETCD leader caused HKube to stop working #927
- Support self signed repositories for kaniko build #427
- python wrapper does not work with s3 #887
- Pipeline driver queue got stuck after ETCD disconnection #942
- deleted algorithms are not deleted from the ETCD /algorithms/queue/ #928
- Replying to the "get from peer" request causes the algorithm to crash due to out of memory #921
- download node result (if it contains more than 1 MB) should perform the same as download pipeline result #910
- Algorithm Queue uncaughtException: Watcher canceled: #894
- Wrong monitor-server path in helm chart #945
- Things to solve for 1.5 #922
- build algorithm new version change the algorithm entry point #880
- "Run Node" on batch #919
- when handling big size data and reaching the cluster memory limits the Task executor does not release idle pods #932
- java algorithm code APIs should receive MAP or list and return MAP #889
- Low TTL #863
- UI crashes when one user deletes an object that another user is viewing #856
- 700MB input causes the Resource Manager to get stuck #920
- Worker cache size causing the pod to terminate #904
- Get data from Peer function when dealing with big data #903
- pipeline with batch stuck with node "storing" #902
- Debug worker and Algorithms should have same default discovery encoding #842
- Missing timeout on debug worker client side of discovery communication #841
- Unable to delete algorithm properties once added #792
- Known UI Issues #525
- API server fails to validate node input #893
- API server crash when the storage is too big #740
- cannot create storage buckets when clusterName has a period in the name #655
- Redis Storage filled up #790
- hkubectl sync watch does not copy the source file to the algorithm mount storage #890
- Java build logs size #855
- Etcd issues #793
- sometimes an algorithm does not get results from the previous algorithm #885
- Node worker fails to read batch result #886
- GPU pipelines take all GPU resources; non-GPU pipelines are pending (even though non-GPU resources are available) #795
- Failed pipeline stayed in "pending" (due to failed communication with ETCD) #794
- opengl algorithm does not work #780
- Different priority Pipelines with Different algorithm #722
- Tensorboard status not updated to running when running in a different namespace #865
- API server does not read configuration from config file #797
- Github example algorithm should be using the new wrapper #826
- Can't install HKube on Kubernetes 1.16 #540
- concurrent Pipelines works only within the experiment #746
- Known issues #477
- caching wait any - input equal null #852
- in debug "hkubeapi.start_algorithm" and "start_stored_subpipeline" with "resultAsRaw=True" does not return results #830
- this._nodes.getWaitAny is not a function #747
- wait any + batch return different result each run #825
- randomPipeStored pipeline got stuck once every few hours #779
- authorization error when working with private repository in gitlab (works with github ) #840
- Algorithm Close abnormal #604
- Duplicated sub-pipelines #603
- when a node has more than one input and one of them is "flowInput=null", the node is skipped #832
- pause and resume batch on batch pipeline #823
- fail to build gitlab algorithm #822
- pipeline driver out of memory #809
- run node (execute caching) on a batch #781
- validate pipeline input when input equal null #737
- sometimes a worker completes its job but does not change its status to done #718
- worker displayed as active when pipeline failed #686
- Graph is Unreadable in some cases #656
- github build ignores the commit id #770
- Experiment Issues #744
- deleting the "main" experiment causes the dashboard to crash #778
- invalid task status exit #709
- subpipeline tracing fix #591
- Experiment name validation #791
- GPU - clean unused worker #783
Closed issues:
- Changing attributes or image of an algorithm should restart any of the algorithm pod where worker is in ready state. #319
- Add support for ImagePullSecrets #924
- Algorithm definition should include the output mime type (jpg, json,txt .....) #877
- while algorithm is serving other algorithm "get from peer" request it should not go down #911
- Improve travis yml #872
- Debug disconnection #845
- Check Node 14 - and upgrade #835
- if the necessary data can be found in the worker or the cache, or the "get from peer" fails once, the "get from peer" should not be executed #912
- Errors in Worker #848
- debug and run existing algorithm with single click #705
- add --wait to hkubectl pipeline run #434
- Download result #815
- Missing useful batch operations for multiple batch parameters #660
- Job Graph: add sub-pipelines and code-algorithms links #750
- If a pipeline could not be executed due to a resource problem, the reason should be displayed #803
- shorten job and task id #860
- Java algorithm builder #789
- Show tensorboard while algorithm is running. #681
- Streaming #672
- View analysis of algorithm using tensorflow , in Tensorboard #610
- the fail logic in streaming pipeline should act differently so hkube will create a new algorithm container on each crash #590
- the batch phases in the streaming pipeline should increase and decrease its instances according to the pipeline workload #589
- the streaming layer should have an abstraction layer so it will be possible to switch between streaming frameworks seamlessly #588
- Pipeline repository management #72
- Caching Service #814
- Code-API catch WS events properly #851
- Add type to a pipeline that in debug/dev mode #843
- Task Level HW Resource Requirements #749
- update the hkubectl #628
- Etcd Tuning #15
- Stop build #834
- delete pipeline should stop all active and pending pipeline #820
- add kibana and grafana dashboards automatically on HKube deployment #666
- missing pagination for batched algorithm #600
- Python Wrapper: get access to pipeline's flowInput #663
- FlowInput #811
- algorithm with no github commit #819
- Code API #513
- Trigger execution flowInput is incorrect #679
- Algorithm version is not specified in dashboard "Worker" details #658
- Missing JOB ID for batch algorithm in Node Information-> Input output Details #599
- Etcd UI short timeout #729
- Missing easy access to tasks logs after worker is down #661
- Automatically create new algorithm if image changes #385
- thinking of the ability to auto build algorithm after github hooks #372
- add the ability to build algorithm from url #371
- Allow running a pipeline in recurring mode, starting again when pipeline ends. #343
- Add support for builds on OpenShift #755
- Support for OpenGL algorithms #701
Merged pull requests:
- copy to local repository #989 (golanha)
- remove -it on docker get dependency #987 (golanha)
- Support offline installation #986 (golanha)
- bring debs for dockerfile install #975 (golanha)
- control pipeline-driver resources #984 (yehiyam)
- change default algorithms resources #980 (yehiyam)
- add ALGORITHM_DISCONNECTED_TIMEOUT_MS to template #978 (yehiyam)
- fix: caching flowInput #971 (NassiHarel)
- add algorithm memory as environment variable #963 (golanha)
- add warning for algorithm-queue not created #967 (yehiyam)
- update python wrapper in builder to version 2.0.28 #964 (hkube-ci)
- add ingress prefix to debug url #962 (yehiyam)
- feat: add memoryCache #954 (NassiHarel)
- update nodejs wrapper to 2.0.22 #953 (hkube-ci)
- update nodejs wrapper to 2.0.21 #952 (hkube-ci)
- update nodejs wrapper to 2.0.20 #951 (hkube-ci)
- update python wrapper in builder to version 2.0.24 #950 (hkube-ci)
- update nodejs wrapper to 2.0.19 #949 (hkube-ci)
- update nodejs wrapper to 2.0.18 #948 (hkube-ci)
- algorithm terminating take too long #941
- feat: add get custom data with metadata #947 (NassiHarel)
- feat: change to reserved memory #958 (NassiHarel)
- update python wrapper in builder to version 2.0.23 #946 (hkube-ci)
- update etcd package #943 (yehiyam)
- update python wrapper in builder to version 2.0.22 #944 (hkube-ci)
- add option to use _auth instead of _authToken in npmrc #940 (yehiyam)
- add imagePullSecret #938 (yehiyam)
- update python wrapper in builder to version 2.0.21 #939 (hkube-ci)
- feat: improve apply function #937 (NassiHarel)
- update python wrapper in builder to version 2.0.20 #936 (hkube-ci)
- feat: algorithm apply #931 (NassiHarel)
- Download pipeline and task results #930 (NassiHarel)
- update python wrapper in builder to version 2.0.19 #935 (hkube-ci)
- feat: improve cache-pipeline #929 (NassiHarel)
- update python wrapper in builder to version 2.0.18 #934 (hkube-ci)
- change job name guid to 5 chars #933 (yehiyam)
- update python wrapper in builder to version 2.0.17 #926 (hkube-ci)
- feat: handle algorithm serving state #923 (NassiHarel)
- update python wrapper in builder to version 2.0.15 #918 (hkube-ci)
- fix result of subpipeline to algorithm #914 (yehiyam)
- handle large queue size when persisting to redis #907 (yehiyam)
- update python wrapper in builder to version 2.0.12 #909 (hkube-ci)
- update python wrapper in builder to version 2.0.9 #901 (hkube-ci)
- Pipeline-driver recovery #900 (yehiyam)
- update nodejs wrapper to 2.0.17 #899 (hkube-ci)
- feat: improve progress, prepare for streaming #898 (NassiHarel)
- update etcd package #896 (yehiyam)
- update nodejs wrapper to 2.0.16 #897 (hkube-ci)
- update python wrapper in builder to version 2.0.8 #895 (hkube-ci)
- update nodejs wrapper to 2.0.15 #891 (hkube-ci)
- feat: split test files in PD, for simpler V2 merging in the future #888 (NassiHarel)
- update nodejs wrapper to 2.0.14 #882 (hkube-ci)
- feat: separate api validator to files #884 (NassiHarel)
- Jaeger openshift #883 (yehiyam)
- fix: remove uuid dependency, replace with @hkube/uid #879 (NassiHarel)
- Tensorboard namespace #878 (yehiyam)
- feat: reduce round-trips to Etcd and Redis #875 (NassiHarel)
- update nodejs wrapper to 2.0.13 #874 (hkube-ci)
v1.2.186 (2020-04-30)
Implemented enhancements:
- Max concurrent pipelines #741 (NassiHarel)
- Tensorboard #652
Fixed bugs:
- Fail algorithm build need to remove the Stop button #774
- Build algorithm fail from time to time #769
- Partial gpu support #772 (yehiyam)
- Allow user to set node graph direction #642
- triggered pipeline input data #756
- cron pipeline priority default overwrite the stored #754
- GUI - create pipeline need to change the concurrent Pipelines to object #742
- Download pipeline result file contain binary data #734
- build algorithm from github long name #713
- Code API "Node information" does not displayed #625
- Fail to execute sub-pipeline #735
- inconsistent flowInput validation #725
- workers of a batch in a failed pipeline still running #638
- No errors in dashboard when triggers fail #739
- disable Hkube metadata #728
- resume multiple batch pipeline #719
- pipeline execution input overwritten by pipeline original input #716
- github private repository need to add username #715
- Github branch does not create new build #714
- No taskId on refreshing node's logs. #702
- pre schedule of algorithm that timed out and exited with non 0 code #639
- Can't get result's indexed item #662
- tensors board #687
- Configure pipeline to call other pipeline(or sub pipe line) #605
- Allow user to specify different npm repository for builds #387
- Error applying algorithm defined from github #699
- invalid task status exit #710
- algorithm-builder ignores baseImage property in git mode #695
- No pipeline stats after experiments feature #693
- Add Trigger type to pipelines Types #691
- Logs fetched twice #689
- add form data to components in the UI #582
v1.2.132 (2020-01-28)
Fixed bugs:
- subpipeline not shown in dashboard (no experiment prefix) #680
- Graph nodes color is sometimes wrong #657
Implemented enhancements:
- Support using arbitrary base images #563
Fixed bugs:
- Batch on Batch #676
- Algorithm retry overwrite batchTolerance #674
- git webhook creates an empty version which causes weird behaviors #607
- Nodes in pipeline graph are rendered in wrong order #358
- workers don't get aborted when there is a requirement for another algorithm #201
Closed issues:
- Missing option for complete pipeline result including non-final nodes #673
- Add TTL for each Algorithm #601
- Allow defining TTL in Algorithm/Node level #342
v1.2.127 (2020-01-23)
Closed issues:
v1.2.109 (2020-01-16)
Fixed bugs:
- Algorithm version window display other algorithms version of similar algorithm names #624
- input for an algorithm after skipped node #583
Closed issues:
- copy worker ID to clipboard - from Node Information form #598
v1.2.108 (2020-01-15)
Closed issues:
- ParentID for each Algorithm- feature request #597
v1.2.107 (2020-01-14)
Implemented enhancements:
- Need to update the storage-cleaner #568
Fixed bugs:
- no error message when trying to run non existing algorithm or pipeline with code api #510
Closed issues:
- Delete related data after algorithm deletion #627
- Cache storage data in worker #576
- add simple API for storage #554
- Add Graph API & Improve the pipeline graph #545
- Expose API for gitlab and github webhook upon push #518
- Expose through UI, building algorithm from code in git #517
- Add pause/resume pipeline #344
- Code API: execute command for creating dynamic algorithm from worker #190
v1.2.103 (2020-01-13)
Fixed bugs:
- Stop pipeline while running "results" status is completed #620
- Algorithm with non-exist image #500
- rerun pipeline from cached pipeline #326
Closed issues:
- Dashboard API to change the logger verbosity level #609
- We should add metrics and scores to algorithms #592
- Put result to storage #580
- invalid task status exit #548
v1.2.96 (2020-01-05)
Fixed bugs:
- Update algorithm version while running force = false #648
v1.2.92 (2020-01-02)
Implemented enhancements:
- Github issues cleanup #396
v1.2.79 (2019-12-25)
Implemented enhancements:
- Create package for removing duplicate code #573
Closed issues:
- There is no info on batch errors #284
v1.2.77 (2019-12-24)
Closed issues:
- Need to think about common code between pipeline-driver and algorithm queue #166
v1.2.76 (2019-12-24)
Fixed bugs:
- batch tasks are not canceled when pipeline fails #578
- Jobs cleaner not working #532
- not constant execId and pipelineId types in the worker on outgoing messages #521
- Batch errors are not visible #366
- algorithm queue writes CrashLoopBackOff every 15 seconds #239
- Queue can't accept new job after job arrived with state stop #225
Closed issues:
- Remove prefix raw- for raw pipeline #615
- Add property "Tags" & "Types" for each pipeline execution #614
- Split tests into files #520
- Recovery after deploy #512
- Versioning for algorithms. #487
- Merge state with status #422
- we should replace our wrapping implementation for algorithms to work as a package #347
- Running time for each algorithm #292
- algorithm builder: features implementation #289
- Nodes and pipeline results #287
- Task Executor Features #270
v1.2.72 (2019-12-17)
Closed issues:
- Algorithm name length issue #259
v1.2.66 (2019-12-16)
Fixed bugs:
- ENAMETOOLONG: name too long - due to subpipe of subpipe #562
- Cannot run new algorithm after image update #509
Closed issues:
- Update algorithm name #530
- Allow stopping a worker from UI #321
- Create hkube flow story with blog #306
- Download source code #302
- Large CPU algorithm will not be scheduled #233
- When stopping pipeline need to clear data from etcd #200
v1.2.63 (2019-12-12)
Fixed bugs:
- algorithm image name: "must not have leading or trailing whitespace" #581
Closed issues:
- Fix the clean job api #569
v1.2.62 (2019-12-12)
Closed issues:
- Improve the dashboard #544
v1.2.60 (2019-12-11)
Closed issues:
- insert to storage on result #571
v1.2.58 (2019-12-08)
Closed issues:
- Add Event entity #234
v1.2.57 (2019-12-02)
Fixed bugs:
- algorithm versions list is copied to all open algorithms #593
Closed issues:
- move examples/algorithm-example-python/ to ----> own repository #408
v1.2.56 (2019-11-27)
Implemented enhancements:
- Direct call from dashboard to API-Server #570
v1.2.44 (2019-11-10)
Closed issues:
- Move to husky? #483
v1.2.26 (2019-10-28)
Fixed bugs:
- cannot stop code-api algorithm #501
- out of memory in the pipeline driver #490
- debug worker #431
- Image Pull Backoff #224
Closed issues:
- check the bull removeOnFail #467
- Preschedule: run child node before completion. #454
- Moving to docker in docker instead of using machine's docker #368
- Add option for algorithms to execute another algorithms using API #288
- Update doc on hkube.io #191
v1.2.11 (2019-10-03)
Implemented enhancements:
- Change the way jobs are scheduled #232
Fixed bugs:
- no loading screen #495
Closed issues:
- send last pipeline result to algorithm #269
- create timer for bootstrap on algorithm #241
- Check high availability of K8s masters #217
- Reduce intervals for resource-manager and task-executor #205
v1.2.9 (2019-09-19)
Fixed bugs:
- inconsistent datatypes return on code api execId and pipelineId #502
v1.2.2 (2019-09-04)
Implemented enhancements:
- Rewrite etcd.hkube #184
Fixed bugs:
- pipeline driver fails and go to recovering mode and no other driver takes the job #469
- turning on a cron job for a pipeline #442
- race condition in dashboard #430
- create a hot worker and then delete it #278
v1.1.1084 (2019-08-28)
Fixed bugs:
- Algorithm readme not updating #423
v1.1.1081 (2019-08-26)
Fixed bugs:
- download pipeline results doesn't work #468
- stalled job #463
- ui crashes if you click on a view before the data was loaded #460
- sub pipeline from the algorunner doesn't work #458
- cpu usage when adding an algorithm #444
- node input output details doesn't show #441
- pipeline description #405
- worker page pod additional details #402
- deleting the pipeline-driver-queue when executing a new pipeline #290
v1.1.1080 (2019-08-21)
Closed issues:
- we should verify that checksum operation for algorithm build is not heavy #488
v1.1.1072 (2019-08-18)
v1.1.1068 (2019-08-13)
Fixed bugs:
- dashboard crashes #473
Closed issues:
- Add watch for stop build #453
v1.1.1061 (2019-08-07)
Fixed bugs:
- building a node algorithm doesn't work #457
v1.1.1058 (2019-08-07)
Fixed bugs:
- building a node algorithm #464
v1.1.1046 (2019-08-04)
Fixed bugs:
- Stop algorithm build #452
v1.1.1037 (2019-07-30)
Fixed bugs:
- pipelines page #443
v1.1.1033 (2019-07-30)
Fixed bugs:
- set ingress permission in cleaner role #432
v1.1.1031 (2019-07-29)
Fixed bugs:
- all pages "show number of rows in a table" #407
v1.1.1030 (2019-07-28)
Fixed bugs:
- fix docker image parsing #433
- edit description. both on pipelines and algorithms #404
- drivers page- sort #403
v1.1.1022 (2019-07-17)
Fixed bugs:
- Store Algorithm Readme #393
- python automated build ignores exceptions on ws.send #383
- Fix auto complete #382
- add new algorithm from build #308
Closed issues:
- stalled in node lead to failed pipeline #412
- json parser for algorithm from build #309
- Pipeline Driver not deleting /jobs/tasks/<JobId> #277
v1.1.1021 (2019-07-08)
Closed issues:
- Suggestions #279
v1.1.1016 (2019-06-30)
Closed issues:
- update base image tag automatically #399
v1.1.1015 (2019-06-30)
Fixed bugs:
- add/edit algorithm description #406
- input output details on a worker #401
- view pod logs from the workers tab #336
- the UI is very slow #333
- pipelines page- show pipeline details #327
v1.1.1014 (2019-06-30)
Fixed bugs:
- Fix API CLI documentation #369
v1.1.1010 (2019-06-25)
Fixed bugs:
- post request endpoints are not correct #337
- update algorithm #335
- running pipeline with input #332
- creating a pipeline with the wizard #331
- filter the jobs table by some key then run "simple" pipeline - UI crashes #329
v1.1.1009 (2019-06-24)
v1.1.1007 (2019-06-23)
Closed issues:
- Restore missing features #381
v1.1.1003 (2019-06-19)
Merged pull requests:
- Add issue templates #363 (denvash)
- Create CODE_OF_CONDUCT.md #361 (NassiHarel)
v1.1.1001 (2019-06-18)
Fixed bugs:
- workers page open pod details crashes the page #334
- create pipeline in the ui doesn't show nodes to the pipeline json #330
- jobs page - Unnecessary filter icon at the Pipeline status column #328
- Fix issue with pipeline rerun #312
Closed issues:
- Switch to WS as default #325
- Add memory human readable property #311
- Dashboard Refactor #310
- Fix issues with env: entry-point and worker-protocol for debug #307
- reduce node_modules size when building docker by: npm i --production #285
v1.1.1 (2019-06-04)
Implemented enhancements:
- Add a tool-tip to icons in main menu bar when menu is collapsed #315
- Move to Node v10 #210
- Add rate-limiter to api-server #21
Fixed bugs:
- Problem in selected Algorithm logs tab, in pipeline graph view. #313
- worker that got a stopped job doesn't clear well #296
- simulator gets stuck; only a monitor server restart helps, and not for long #295
- error key in json was added even when there is no error in the pipeline #286
- trigger service will not exit if unable to reconnect to ETCD #275
- pipeline should report finish only if result was successfully saved to storage and not before. #257
- driver should report error if process failed due to memory usage more than 256MB #256
- reconciler should kill redundant pipeline-drivers if defined less than actual running drivers #255
- a pipeline that worked stopped working and fails after crashLoopBackOff #243
- add empty LocationConstraint (production bug) #240
- CrashLoopBackOff when algorithm exit and pipeline stopped #238
- sometimes storageInfo does not exist #237
- delete algorithm after all pipelines completed and pipeline was deleted is not working #235
- task executor sometimes request for double the workers it really needs #227
- resource allocation for a pod is not as described #220
- failed to start pipeline if "input" was missing from pipeline descriptor #216
- last algorithm in a batch is taking too much time to complete #215
- ttl reason for expiration time should be in real time, not epoch #212
- override stored pipeline options #209
- algorithm operator stuck when etcd is failed to connect #206
- unable to find handler for job #204
- transition is invalid in current state #203
- Cannot read property 'filter' of undefined #202
- Pipeline that failed get stopped status instead of failed #198
- delete an algorithm is not possible #197
- resource manager requests more resources than available #196
- algorithm that failed and threw an error stays up in the system and not getting cleaned/deleted #193
- stop pipeline doesn't work #192
- algorithm queue scores was not ordered when new job added during scoring update #188
- wait any return wrong number of results. #187
- failed algorithm return null response in the results #177
- execute a request and immediately stop it, the request remains in status "stopping" #175
- Make triggers pipeline work the other way #173
- pipeline result is null/empty #167
- Fix AlgorithmQueue active count #163
- logs from the algorithm are sometimes not written to the elastic search logs #148
- pod stopped working, show an error that need to be tracked #147
- max retries of an algorithm should fail the pipeline #146
- large batch size doesn't run #144
- Etcd periodic cleaner #142
- Pipeline request using pipeline which isn't stored in the repository of pipeline descriptors #43
- When stopping pipeline while worker is init #40
- worker tries to fetch null data from storage provider #39
- Run batch on node results not working due to new storage feature #38
- apak algo doesn't deal with stop #26
- when stopping an apak pipeline the amount of watchers is increasing #25
- exec stored with a large amount of rows in the body of the request returns a 413 error "payload too large" #20
- batch tolerance 0% sets the default to 80 #19
- inconsistency batch tolerance behaviour #18
- max retry on failing pods should return an error and close pipeline #17
- logs take very long to be written to ES #14
- pipelines that sometimes do not end the run (inconsistent) #13
- results of a pipeline comes with a big delay #12
- pending pipelines that were stopped before starting never being stopped #11
- get status of a pipeline after some nodes has finished #10
- worker crashes after timeout between the worker and the algorithm #8
- the '0' result or false result are not returned to the api-server #5
- inconsistent end status received after PL completed #2
Closed issues:
- Clean docker trash after build #303
- Add stream support in FS storage adapter #301
- GPU resource management seems to not work well #297
- bucket name should also include cluster name to be able to support multiple clients from different clusters #294
- Need to re-enter tasks to queue after stalled #260
- Simplify API for adding algorithm to HKUBE #250
- Add description field to pipeline #245
- reduce logs on file watch error #244
- test issue #242
- validate exec raw run without trigger #231
- resource manager requests more resources than the cluster has; this results in workers stuck in pending status that don't exit on their own #230
- batch on empty array not working #229
- validate deep calls of trigger pipelines not exceed the limitation #228
- Check why pipeline-driver has retry job #222
- Insufficient cpu for pipeline drivers #219
- Check if exec/tree still works #211
- stop pipeline if expired #186 #195 #208
- unable to create job with capital letter #199
- Add expiration time #195
- Add support for /internal/sub-pipelines & /internal/stop #194
- Review ETCD tree architecture #189
- Add delete execution after job completed #186
- Add watch state for all jobs in etcd #183
- Create common k8s pods npm package #182
- storage cleaner #181
- Implement stop in pipeline-driver-queue #180
- etcd issues when saving large data #179
- When pipeline failed, need to send stop to all workers (batch) #178
- change the pipeline name on /exec/raw #176
- add validation to pipeline/algorithms name #174
- add pipelines results stored/raw #172
- pipeline-driver queue support #171
- Add Prometheus cpu metric #169
- enternceTime #168
- Implement stop in algorithm-queue #164
- When we have triggers of pipelines we get very long jobId which cause error in minio #161
- Fix pipeline-driver-queue issue #160
- Change the resource manager to support the pipeline driver queue #152
- Create pipeline driver queue #151
- Add pre-commit script to package.json #150
- Validate algorithmName in nodes pipeline and exists in store #149
- Need to think about required cpu vs actual cpu #145
- Scale api-server #143
- Add unit parsing to /algorithms API #140
- Handle algorithm deletion #138
- S3 Buckets limit #137
- Add API for algorithms store #136
- Katacoda #135
- Add WS adapter #134
- ETCD cleanup & storing results on custom storage #131
- Add trigger tracking #130
- Landing Page #127
- api server should store input in custom storage #126
- resource executor meeting status #125
- Resource Allocation #124
- View statistics of algorithm's pending requests in a graph view #112
- View statistics of pipeline's pending requests in a graph view #104
- NFRs Version 100 #85
- NFRs ver. Beta1 #79
- Receiving 8 pipeline requests within an hour #78
- Concurrent execution of 4 pipeline requests #77
- Pipeline request with a batch of 1,000 algorithms #76
- HKube execution overheads #75
- High Availability for algorithm services #74
- High Availability for HKube infrastructure services - version Beta1 #73
- Pipelines' execution management #66
- Retry algorithm execution, if system resource error (e.g. VM/ Docker failure) occurred during the algorithm execution #65
- Cancel pipeline execution, if the maximum retries for execution of failed algorithm exceeded #62
- production jaeger deployment #37
- The name of the pipeline will be its identifier #35
- Part or all of the inputs of algorithm are taken from the pipeline's request [R3] [R6] #34
- Batch size is determined by the pipeline's input #30
- Define algorithms which will be executed in the beginning of the pipeline [R8] #27
- api server should return data instead of custom-storage link #24
- data to custom storage should support stream #23
- stop pipeline sometimes doesn't work #9
- validation on flowInput on store pipeline #7
- results responses comes with jobid object #6
- cancel completed PL. #4
- each time a PL is finished, the results api should be updated #3
* This Changelog was automatically generated by github_changelog_generator