Changelog

[v2.3.19] (2022-04-03)

Notable changes:

  • UI - show queued job and manage preferred queue using drag and drop #1476
  • Stability improvements - recover from redis, etcd, mongodb disconnections and crashes #1433

Fixed bugs:

  • trigger-service ETCD watch disconnected after ETCD failover #1558
  • building algorithm - an algorithm version was not created after a successful build #1555
  • pipeline driver queue - recovery when all jobs are in Preferred queue #1547
  • debug: debugged algorithm error is not reported #1474
  • possible inconsistency upon algorithm queue crashes #1382
  • java code api does not work #1358
  • Ingress does not work in kubernetes 1.22 #1354
  • Swagger api /pipeline/results and /pipeline/status limit their responses to 1000 jobs #1190
  • UI - Gateway node should have default "stateType": "stateful" #1562
  • UI - datasource copy button version/snapshot generate wrong json #1404
  • Recover from Redis disconnection #1433

Closed issues:

  • User ordering for pending jobs - pipeline-driver-queue #1449
  • Make algorithm loading errors visible in dashboard #1168
  • Add storageClass placeholders in helm chart #1557
  • Add active dashboard #1512
  • Add specific TTL for active pipelines #1485

[v2.2.47] (2022-03-01)

Notable changes:

  • Add output node type, which collects output from multiple nodes in the pipeline
  • Add ability to prioritize queued jobs by jobIds, pipeline name and tags #1449
  • Usability updates to the UI

Fixed bugs:

  • when stopping a running pipeline, the job stays active in the pipeline driver #1534
  • fails to execute a pipeline with output the second time #1531
  • pipeline driver does not scale-up #1529
  • Algorithm queue - when the algorithm starts for the first time after creation, an error is displayed #1443
  • Rebuild failed Error: unable to find build ->... #1412
  • Streaming Gateway send message issues #1376
  • return no tags in managed tag aggregation #1526 (golanha)
  • algorithm operator version #1520 (golanha)
  • Fail to build algorithm with "baseImage":"python:3.7-slim" #1522
  • when creating a Tensorboard or Optuna board, the board is created with the same path #1518
  • fail to execute node with input object that contains flowInput and batch on previous node results #1365
  • pipe-in-pipe requests are blocked by the API server #1521
  • No resources for buildConfig #1515
  • Cannot disable datasource in worker when datasource PVC is not available #1455
  • algorithm-operator does not set pipeline-driver resources #1453
  • Pipeline wizard - in the JSON, streaming "flows" should be under "streaming" #1397
  • if an algorithm is in a terminating state it should be ignored #1348
  • pipeline wizard #1336
  • DVC client fails to move files #1299
  • web browser crashes when viewing the dashboard after a heavy run #1183
  • API-server disconnected from the watcher #1129
  • streaming: stateful node memory leak when reconnecting to stateless node #1108
  • streaming - fails to maintain a high-rate flow over time #1064
  • Streaming stateless NodeJS - pods that were stopped do not restart #1252
  • pipeline "progressVerbosityLevel" does not change the webhook verbosity level #1463
  • Caching and debug node buttons should be disabled in the streaming pipeline #1335

Closed issues:

  • Allow user to specify debug nodes in pipeline execution #1324
  • allow multiple pipelines with output #1533 (yehiyam)
  • Add specific TTL for active pipelines to v1.3 #1482
  • Create output kind #1448
  • Job search API #1337
  • Algorithm Queue auto-removal #1281
  • Add ability to filter out pending jobs (part feature, part bug) #1186
  • Pipeline with conditional algorithms (if statements) #1173
  • Resource consumption optimizations #966
  • algorithm nullable properties #1271
  • streaming #585

Implemented enhancements:

  • FlowInput Size Issue #1303

v2.1.93 (2021-11-07)

Major features:

  • Gateway API for streaming - lets you create a REST API to a streaming channel directly from HKube routing #1233
  • Debugging improvements - there is no need to create a debug algorithm. Any algorithm can be debugged by setting a flag in the pipeline spec

Pull requests:

  • update logger #1431
  • add throttle to k8s error #1422
  • feat: add builds clean #1418
  • add extra registries #1364
  • feat: add errors log info #1417
  • disable enableCheckStalledJobs in worker #1416
  • fix algorithm_queue_scale #1415
  • feat: fix rebuild issue #1414
  • queues snapshot #1387
  • Streaming gateway #1413
  • algorithm-queue scale improvement #1405
  • update nodejs wrapper to 2.0.50 #1409
  • Pipeline status bug #1408
  • update logger and config for fluentbit #1403
  • feat: improve drivers scale #1395
  • update core/algorithm-builder/environments/python/wrapper/requirements.txt to hkube-python-wrapper==2.2.0.dev9 #1394
  • update core/algorithm-builder/environments/python/wrapper/requirements.txt to hkube-python-wrapper==2.2.0.dev7 #1393
  • clean openshift build #1383
  • add emptyDir and configMap mounts #1379
  • update core/algorithm-builder/environments/java/wrapper/version.txt to v2.1.7-SNAPSHOT #1389
  • Support range java #1380
  • update parser to handle batch on object with flowInput #1378
  • don't retry for algorithm execution code api #1373
  • update java wrapper to v1.0.20-SNAPSHOT #1372
  • add more logs #1361
  • java version in one place #1366
  • fix ingress #1363
  • fix: reconciler #1360
  • feat: add queue active by timestamp #1353
  • feat: add search limit #1356
  • add ingress type #1355
  • 2.0 snapshot #1351
  • support log message from debug algorithm #1342
  • update python wrapper in builder to version 2.2.0.dev5 #1346
  • update nodejs wrapper to 2.0.45 #1345
  • debug hkube api #1316
  • fix algorithm container status is undefined message #1333
  • feat: add rerun #1331
  • update python wrapper in builder to version 2.2.0.dev3 #1332
  • update units-converter to allow byte values #1330
  • handle raw run of cached #1326
  • add debugOverride #1321
  • add maxWorkers to algorithm spec #1322
  • feat: algorithm-queue scaled #1320
  • feat: algorithm-queue scale #1318
  • Swaggers merge #1310
  • trigger consumer job when reconnected #1317
  • add pipeline type for caching debug #1313
  • exec algorithm as debug #1309
  • feat: streaming flow validation #1306
  • fix getting algorithm template #1307
  • algorithm-debug #1264
  • pipeline-driver-queue latency issue #1301
  • pin dvc version #1302
  • fix empty docker creds on openshift 3.11 #1292
  • Rest Refactoring #1291
  • feat: improve job stop/pause/result #1290
  • Hkube gc #1287
  • add bodySizeLimit #1288
  • feat: move caching-service to api-server #1284
  • feat: ceil drivers amount #1283
  • feat: jobs search #1277
  • feat: driver tests #1280
  • Pipeline drivers scale #1259
  • feat: worker bugs #1279
  • update nodejs wrapper to 2.0.41 #1278
  • feat: check algorithm container status #1275
  • feat: resolve createIndices #1273
  • feat: resolve edges #1272
  • feat: delete algorithm nullable properties #1268
  • feat: adjust envs #1262
  • feat: add close method #1270
  • feat: reservedMemory parse int #1265
  • feat: driver_tests #1266
  • feat: dataSource -> spec #1261
  • add healthchecks #1263
  • Algorithm queue balancer #1251
  • Streaming gateway #1234
  • feat: add labels and annotations #1228
  • update kubernetes client package to fix k3s issue #1257
  • Concurrent pipelines #1250
  • use later nodejswrapper c02659e
  • Revert "Support range" 4635b04
  • Support range 7907765

v2.0.146 (2021-04-28)

Major features:

  • Streaming pipelines - Add the ability to create pipelines that run continuously and pass the data along the flow #263
  • Data resource - This feature enables managing data sources and applying them to pipelines on top of DVC #965
  • Redesigned data transportation - moved data transportation from object storage to ZMQ #771

Pull requests:

  • fix dependency_install_cmd openshift #1245
  • fix: pipeline-driver-queue concurrent pipelines #1244
  • remove zip file after download complete #1242
  • converted tmp root dir to absolute path #1240
  • updated hkube/db version on all services #1237
  • Fix error message #1235
  • java wrapper version 2.0.00 #1232
  • update python wrapper in builder to version 2.1.0 #1230
  • fix no git user #1229
  • added validation for uploaded meta without data file #1227
  • update nodejs wrapper to 2.0.40 #1226
  • updated dependencies #1225
  • Convert to absolute paths #1224
  • update nodejs wrapper to 2.0.39 #1223
  • update nodejs wrapper to 2.0.38 #1222
  • added dvcignore content on create datasource #1220
  • handle no data on webhook #1219
  • fix input display in ui for batch results #1217
  • Fix running snapshot #1216
  • feat: node wrapper version #1213
  • datasource-files-path #1214
  • Bugfix subdirs #1212
  • update python wrapper in builder to version 2.1.0.dev37 #1211
  • feat: logger version #1209
  • feat: add total queue size #1208
  • update-db #1207
  • feat: add logger #1205
  • use workerImage if available #1135
  • feat: change scaler config #1195
  • increase list from api to 100000 #1200
  • ignore-undefined #1203
  • use new graph store #1198
  • feat: add reservedMemory #1197
  • use reservedMemory for allocation #1196
  • update python wrapper in builder to version 2.1.0.dev34 #1194
  • support storageInput for code-api #1193
  • added support for updating credentials #1192
  • update python wrapper in builder to version 2.1.0.dev33 #1189
  • Streaming metrics #1187
  • fixed using external github repository #1185
  • Bugfix snapshot #1184
  • fix: issues #1182
  • Refactor external datasource #1181
  • get results in webhook handler #1175
  • update python wrapper in builder to version 2.1.0.dev32 #1180
  • fix: enhance scaler #1178
  • update python wrapper in builder to version 2.1.0.dev31 #1179
  • added config.template #1169
  • Auto scaler #1170
  • update python wrapper in builder to version 2.1.0.dev30 #1164
  • update python wrapper in builder to version 2.1.0.dev28 #1161
  • update python wrapper in builder to version 2.1.0.dev27 #1159
  • feat: add streaming_roundtrip_metric #1158
  • update python wrapper in builder to version 2.1.0.dev26 #1157
  • Support internal kind #1148
  • fix-settings #1156
  • feat: improve scaling metrics #1155
  • feat: improve durations scaling #1154
  • accurate scale #1145
  • update python wrapper in builder to version 2.1.0.dev25 #1153
  • feat: DAG #1152
  • update python wrapper in builder to version 2.1.0.dev24 #1151
  • update python wrapper in builder to version 2.1.0.dev22 #1147
  • update python wrapper in builder to version 2.1.0.dev21 #1146
  • Datasource failure cleanup #1141
  • update python wrapper in builder to version 2.1.0.dev19 #1142
  • Streaming auto scaler #1130
  • Bugfix datasources #1140
  • add datasource envs and mounts to debug worker #1139
  • update python wrapper in builder to version 2.1.0.dev18 #1138
  • Datasource mounted cleanup #1137
  • implemented delete from git and storage #1115
  • Support datasource external storage #1134
  • add optional api-server prefix #1126
  • clean stopped jobs #1131
  • Cli endpoint #1121
  • Stored pipeline merge #1128
  • fix configmap algorithm operator #1127
  • throw if no hkube configmap #1123
  • update python wrapper in builder to version 2.1.0.dev17 #1117
  • fix component #1116
  • update python wrapper in builder to version 2.1.0.dev16 #1113
  • Datasource #996
  • add stop to isDiscovery #1111
  • feat: streaming retry #1107
  • update python wrapper in builder to version 2.1.0.dev14 #1105
  • feat: handle retry on streaming #1102
  • execute copy on docker #1103
  • use correct project for openshift build #1092
  • update python wrapper in builder to version 2.1.0.dev13 #1099
  • adding stopping command #1097
  • feat: improve worker logs #1090
  • fix: improve disallow changing algorithm type #1094
  • feat: improve worker logging-proxy #1087
  • update python wrapper in builder to version 2.1.0.dev12 #1086
  • upgrade nodejs wrapper to 28 #1085
  • algorithm-deps-install #1081
  • update python wrapper in builder to version 2.1.0.dev11 #1082
  • update python wrapper in builder to version 2.1.0.dev10 #1079
  • feat: enable keys with dots! #1077
  • feat: ignore stateful scale down #1073
  • feat: enable big data debug #1071
  • fill missing data that was too large for debug algorithm #1069
  • feat: fix trigger and boards types #1068
  • use settings.xml at build #1057
  • pass STORAGE_RESULTS_THRESHOLD to pd #1066
  • update python wrapper in builder to version 2.1.0.dev9 #1065
  • feat: add info about node input #1060
  • feat: fix run node twice #1062
  • fix ignoring debug algorithm #1061
  • improve scale up/down #1054
  • remove m2 dir creation in java builder #1049
  • db migration #1039
  • ignore .pyc files #1050
  • use internal artifactory at java build #1040
  • update python wrapper in builder to version 2.1.0.dev8 #1047
  • run npm ci inside docker #1046
  • feat: minor bugs #1044
  • fix nodejs node_modules #1041
  • update python wrapper in builder to version 2.1.0.dev7 #1038
  • ci-try #1033
  • feat: add buildId to version after successful build #1029
  • fix: Cannot set property 'error' of null, remove custom stream child #1028
  • update nodejs wrapper to 2.0.25 #1026
  • feat: delete all .travis.yml #1027
  • add stream flow, improve scale up/down #1023
  • Streaming custom flow #1021
  • feat: trigger builds #1015
  • update python wrapper in builder to version 2.1.0.dev2 #1020
  • update python wrapper in builder to version 2.1.0.dev1 #1019
  • add support for cyclic graph, streaming custom flow #1010
  • V2 streaming #892
  • add pullSecret for openshift builds #1005
  • update python wrapper in builder to version 2.0.30 #1007
  • 2.0-SNAPSHOT #1000
  • fix: bug with cache nodes #1003
  • fix remove file on apply, fix update algorithm #999
  • feat: algorithm guarantee quota #988
  • feat: make versions work with id #992
  • support java build in disconnected environment #995
  • fix wrong validation of algorithmEnv #994
  • feat: add pipelineKind schema 562013a
  • Github actions CI 57b600b
  • feat: update CHANGELOG.md 7f55078

v1.3.147 (2020-10-20)

Fixed bugs:

  • Swagger api exec/algorithm does not accept any input that isn't an empty list #985
  • Fail to "Run Node" when data received from flowInput #969
  • Debug url does not contain ingress prefix #961
  • hkubectl Issues #881
  • stopping the node that contains the ETCD leader caused HKube to stop working #927
  • Support self signed repositories for kaniko build #427
  • python wrapper does not work with s3 #887
  • Pipeline driver queue got stuck after ETCD disconnection #942
  • deleted algorithms are not deleted from the ETCD /algorithms/queue/ #928
  • Replying to the "get from peer" request causes the algorithm to crash due to running out of memory #921
  • download node result (if it contains more than 1 MB) should perform the same as download pipeline result #910
  • Algorithm Queue uncaughtException: Watcher canceled: #894
  • Wrong monitor-server path in helm chart #945
  • Things to solve for 1.5 #922
  • building a new algorithm version changes the algorithm entry point #880
  • "Run Node" on batch #919
  • when handling large data and reaching the cluster memory limits, the Task executor does not release idle pods #932
  • java algorithm code APIs should receive MAP or list and return MAP #889
  • Low TTL #863
  • UI crashes when one user deletes an object that another user is viewing #856
  • 700MB input causes the Resource Manager to get stuck #920
  • Worker cache size causing the pod to terminate #904
  • Get data from Peer function when dealing with big data #903
  • pipeline with batch stuck with node "storing" #902
  • Debug worker and Algorithms should have same default discovery encoding #842
  • Missing timeout on debug worker client side of discovery communication #841
  • Unable to delete algorithm properties once added #792
  • Known UI Issues #525
  • API server mis-validates node input #893
  • API server crash when the storage is too big #740
  • cannot create storage buckets when clusterName has a period in the name #655
  • Redis Storage filled up #790
  • hkubectl sync watch does not copy the source file to the algorithm mount storage #890
  • Java build logs size #855
  • Etcd issues #793
  • sometimes an algorithm does not get results from the previous algorithm #885
  • Node worker fails to read batch result #886
  • GPU pipelines take all GPU resources; non-GPU pipelines are pending (even though non-GPU resources are available) #795
  • Failed pipeline stayed in "pending" (due to failed communication with ETCD) #794
  • opengl algorithm does not work #780
  • Different priority Pipelines with Different algorithm #722
  • Tensorboard status not updated to running when running in a different namespace #865
  • API server does not read configuration from config file #797
  • Github example algorithm should be using the new wrapper #826
  • Can't install HKube on Kubernetes 1.16 #540
  • concurrent Pipelines works only within the experiment #746
  • Known issues #477
  • caching wait any - input equal null #852
  • in debug "hkubeapi.start_algorithm" and "start_stored_subpipeline" with "resultAsRaw=True" does not return results #830
  • this._nodes.getWaitAny is not a function #747
  • wait any + batch return different result each run #825
  • randomPipeStored pipeline got stuck once every few hours #779
  • authorization error when working with private repository in gitlab (works with github) #840
  • Algorithm Close abnormal #604
  • Duplicated sub-pipelines #603
  • when a node has more than one input and one of them is "flowInput=null", the node is skipped #832
  • pause and resume batch on batch pipeline #823
  • fail to build gitlab algorithm #822
  • pipeline driver out of memory #809
  • run node (execute caching) on a batch #781
  • validate pipeline input when input equal null #737
  • sometimes a worker completes its job but does not change its status to done #718
  • worker displayed as active when pipeline failed #686
  • Graph is Unreadable in some cases #656
  • github build ignores the commitid #770
  • Experiment Issues #744
  • deleting the "main" experiment causes the dashboard to crash #778
  • invalid task status exit #709
  • subpipeline tracing fix #591
  • Experiment name validation #791
  • GPU - clean unused worker #783

Closed issues:

  • Changing attributes or image of an algorithm should restart any of the algorithm pod where worker is in ready state. #319
  • Add support for ImagePullSecrets #924
  • Algorithm definition should include the output mime type (jpg, json, txt, ...) #877
  • while an algorithm is serving another algorithm's "get from peer" request, it should not go down #911
  • Improve travis yml #872
  • Debug disconnection #845
  • Check Node 14 - and upgrade #835
  • if the necessary data can be found in the worker or the cache, or if "get from peer" fails once, "get from peer" should not be executed #912
  • Errors in Worker #848
  • debug and run existing algorithm with single click #705
  • add --wait to hkubectl pipeline run #434
  • Download result #815
  • Missing useful batch operations for multiple batch parameters #660
  • Job Graph: add sub-pipelines and code-algorithms links #750
  • If a pipeline could not be executed due to a resource problem, the reason should be displayed #803
  • shorten job and task id #860
  • Java algorithm builder #789
  • Show tensorboard while algorithm is running. #681
  • Streaming #672
  • View analysis of algorithm using tensorflow, in Tensorboard #610
  • the fail logic in streaming pipeline should act differently so hkube will create a new algorithm container on each crash #590
  • the batch phases in the streaming pipeline should increase and decrease their instances according to the pipeline workload #589
  • the streaming layer should have an abstraction layer so it will be possible to switch between streaming frameworks seamlessly #588
  • Pipeline repository management #72
  • Caching Service #814
  • Code-API catch WS events properly #851
  • Add type to a pipeline that is in debug/dev mode #843
  • Task Level HW Resource Requirements #749
  • update the hkubectl #628
  • Etcd Tuning #15
  • Stop build #834
  • delete pipeline should stop all active and pending pipelines #820
  • add kibana and grafana dashboards automatically on HKube deployment #666
  • missing pagination for batched algorithm #600
  • Python Wrapper: get access to pipeline's flowInput #663
  • FlowInput #811
  • algorithm with no github commit #819
  • Code API #513
  • Trigger execution flowInput is incorrect #679
  • Algorithm version is not specified in dashboard "Worker" details #658
  • Missing JOB ID for batch algorithm in Node Information-> Input output Details #599
  • Etcd UI short timeout #729
  • Missing easy access to tasks logs after worker is down #661
  • Automatically create new algorithm if image changes #385
  • think about the ability to auto-build an algorithm after github hooks #372
  • add the ability to build algorithm from url #371
  • Allow running a pipeline in recurring mode, starting again when pipeline ends. #343
  • Add support for builds on OpenShift #755
  • Support for OpenGL algorithms #701

v1.2.186 (2020-04-30)

Fixed bugs:

  • Failed algorithm build needs to remove the Stop button #774
  • Algorithm build fails from time to time #769
  • Partial gpu support #772 (yehiyam)
  • Allow user to set node graph direction #642
  • triggered pipeline input data #756
  • cron pipeline priority default overwrites the stored priority #754
  • GUI - create pipeline need to change the concurrent Pipelines to object #742
  • Download pipeline result file contains binary data #734
  • build algorithm from github long name #713
  • Code API "Node information" is not displayed #625
  • Fail to execute sub-pipeline #735
  • inconsistent flowInput validation #725
  • workers of a batch in a failed pipeline still running #638
  • No errors in dashboard when triggers fail #739
  • disable Hkube metadata #728
  • resume multiple batch pipeline #719
  • pipeline execution input overwritten by pipeline original input #716
  • github private repository need to add username #715
  • Github branch does not create new build #714
  • No taskId on refreshing node's logs. #702
  • pre schedule of algorithm that timed out and exited with non 0 code #639
  • Can't get result's indexed item #662
  • tensors board #687
  • Configure pipeline to call other pipeline (or sub-pipeline) #605
  • Allow user to specify different npm repository for builds #387
  • Error applying algorithm defined from github #699
  • invalid task status exit #710
  • algorithm-builder ignores baseImage property in git mode #695
  • No pipeline stats after experiments feature #693
  • Add Trigger type to pipelines Types #691
  • Logs fetched twice #689
  • add form data to components in the UI #582

v1.2.132 (2020-01-28)

Fixed bugs:

  • subpipeline not shown in dashboard (no experiment prefix) #680
  • Graph nodes color is sometimes wrong #657

Implemented enhancements:

  • Support using arbitrary base images #563

Fixed bugs:

  • Batch on Batch #676
  • Algorithm retry overwrite batchTolerance #674
  • git webhook creates an empty version which causes weird behavior #607
  • Nodes in pipeline graph are rendered in wrong order #358
  • workers don't get aborted when there is a requirement for another algorithm #201

Closed issues:

  • Missing option for complete pipeline result including non-final nodes #673
  • Add TTL for each Algorithm #601
  • Allow defining TTL in Algorithm/Node level #342

v1.2.127 (2020-01-23)

Closed issues:

  • Run algorithm without creating pipeline for it #647
  • Algorithm retry #602

v1.2.109 (2020-01-16)

Fixed bugs:

  • Algorithm version window displays versions of other algorithms with similar names #624
  • input for an algorithm after skipped node #583

Closed issues:

  • copy worker ID to clipboard - from Node Information form #598

v1.2.108 (2020-01-15)

Closed issues:

  • ParentID for each Algorithm- feature request #597

v1.2.107 (2020-01-14)

Implemented enhancements:

  • Need to update the storage-cleaner #568

Fixed bugs:

  • no error message when trying to run a non-existent algorithm or pipeline with code api #510

Closed issues:

  • Delete related data after algorithm deletion #627
  • Cache storage data in worker #576
  • add simple API for storage #554
  • Add Graph API & Improve the pipeline graph #545
  • Expose API for gitlab and github webhook upon push #518
  • Expose through UI, building algorithm from code in git #517
  • Add pause/resume pipeline #344
  • Code API: execute command for creating dynamic algorithm from worker #190

v1.2.103 (2020-01-13)

Fixed bugs:

  • Stop pipeline while running "results" status is completed #620
  • Algorithm with non-existent image #500
  • rerun pipeline from cached pipeline #326

Closed issues:

  • Dashboard API to change the logger verbosity level #609
  • We should add metrics and scores to algorithms #592
  • Put result to storage #580
  • invalid task status exit #548

v1.2.96 (2020-01-05)

Fixed bugs:

  • Update algorithm version while running force = false #648

v1.2.92 (2020-01-02)

Implemented enhancements:

  • Github issues cleanup #396

v1.2.79 (2019-12-25)

Implemented enhancements:

  • Create package for removing duplicate code #573

Closed issues:

  • There is no info on batch errors #284

v1.2.77 (2019-12-24)

Closed issues:

  • Need to think about common code between pipeline-driver and algorithm queue #166

v1.2.76 (2019-12-24)

Fixed bugs:

  • batch tasks are not canceled when pipeline fails #578
  • Jobs cleaner not working #532
  • inconsistent execId and pipelineId types in the worker on outgoing messages #521
  • Batch errors are not visible #366
  • algorithm queue writes CrashLoopBackOff every 15 seconds #239
  • Queue can't accept new job after job arrived with state stop #225

Closed issues:

  • Remove prefix raw- for raw pipeline #615
  • Add property "Tags" & "Types" for each pipeline execution #614
  • Split tests into files #520
  • Recovery after deploy #512
  • Versioning for algorithms. #487
  • Merge state with status #422
  • we should replace our wrapping implementation for algorithms to work as a package #347
  • Running time for each algorithm #292
  • algorithm builder: features implementation #289
  • Nodes and pipeline results #287
  • Task Executor Features #270

v1.2.72 (2019-12-17)

Closed issues:

  • Algorithm name length issue #259

v1.2.66 (2019-12-16)

Fixed bugs:

  • ENAMETOOLONG: name too long - due to subpipe of subpipe #562
  • Cannot run new algorithm after image update #509

Closed issues:

  • Update algorithm name #530
  • Allow stopping a worker from UI #321
  • Create hkube flow story with blog #306
  • Download source code #302
  • Large CPU algorithm will not be scheduled #233
  • When stopping pipeline need to clear data from etcd #200

v1.2.63 (2019-12-12)

Fixed bugs:

  • algorithm image name: "must not have leading or trailing whitespace" #581

Closed issues:

  • Fix the clean job api #569

v1.2.62 (2019-12-12)

Closed issues:

  • Improve the dashboard #544

v1.2.60 (2019-12-11)

Closed issues:

  • insert to storage on result #571

v1.2.58 (2019-12-08)

Closed issues:

  • Add Event entity #234

v1.2.57 (2019-12-02)

Fixed bugs:

  • algorithm versions list is copied to all open algorithms #593

Closed issues:

  • move examples/algorithm-example-python/ to ----> own repository #408

v1.2.56 (2019-11-27)

Implemented enhancements:

  • Direct call from dashboard to API-Server #570

v1.2.44 (2019-11-10)

Closed issues:

  • Move to husky? #483

v1.2.26 (2019-10-28)

Fixed bugs:

  • cannot stop code-api algorithm #501
  • out of memory in the pipeline driver #490
  • debug worker #431
  • Image Pull Backoff #224

Closed issues:

  • check the bull removeOnFail #467
  • Preschedule: run child node before completion. #454
  • Moving to docker in docker instead of using machine's docker #368
  • Add option for algorithms to execute other algorithms using API #288
  • Update doc on hkube.io #191

v1.2.11 (2019-10-03)

Implemented enhancements:

  • Change the way jobs are scheduled #232

Fixed bugs:

  • no loading screen #495

Closed issues:

  • send last pipeline result to algorithm #269
  • create timer for bootstrap on algorithm #241
  • Check high availability of K8s masters #217
  • Reduce intervals for resource-manager and task-executor #205

v1.2.9 (2019-09-19)

Fixed bugs:

  • inconsistent datatypes return on code api execId and pipelineId #502

v1.2.2 (2019-09-04)

Implemented enhancements:

  • Rewrite etcd.hkube #184

Fixed bugs:

  • pipeline driver fails and go to recovering mode and no other driver takes the job #469
  • turning on a cron job for a pipeline #442
  • race condition in dashboard #430
  • create a hot worker and then delete it #278

v1.1.1084 (2019-08-28)

Fixed bugs:

  • Algorithm readme not updating #423

v1.1.1081 (2019-08-26)

Fixed bugs:

  • download pipeline results doesn't work #468
  • stalled job #463
  • ui crashes if you click on a view before the data was loaded #460
  • sub pipeline from the algorunner doesn't work #458
  • cpu usage when adding an algorithm #444
  • node input output details doesn't show #441
  • pipeline description #405
  • worker page pod additional details #402
  • deleting the pipeline-driver-queue when executing a new pipeline #290

Closed issues:

  • Refactor Graph #478
  • Add Dashboard Routing #451
  • Swagger split into files #298

v1.1.1080 (2019-08-21)

Closed issues:

  • we should verify that checksum operation for algorithm build is not heavy #488

v1.1.1072 (2019-08-18)

Closed issues:

  • Loading Screen #395
  • Infinity Table #394

v1.1.1068 (2019-08-13)

Fixed bugs:

  • dashboard crashes #473

Closed issues:

  • Add watch for stop build #453

v1.1.1061 (2019-08-07)

Fixed bugs:

  • building a node algorithm doesn't work #457

v1.1.1058 (2019-08-07)

Fixed bugs:

  • building a node algorithm #464

v1.1.1046 (2019-08-04)

Fixed bugs:

  • Stop algorithm build #452

v1.1.1037 (2019-07-30)

Fixed bugs:

  • pipelines page #443

v1.1.1033 (2019-07-30)

Fixed bugs:

  • set ingress permission in cleaner role #432

v1.1.1031 (2019-07-29)

Fixed bugs:

  • all pages "show number of rows in a table" #407

v1.1.1030 (2019-07-28)

Fixed bugs:

  • fix docker image parsing #433
  • edit description. both on pipelines and algorithms #404
  • drivers page- sort #403

v1.1.1022 (2019-07-17)

Fixed bugs:

  • Store Algorithm Readme #393
  • python automated build ignores exceptions on ws.send #383
  • Fix auto complete #382
  • add new algorithm from build #308

Closed issues:

  • stalled in node lead to failed pipeline #412
  • json parser for algorithm from build #309
  • Pipeline Driver not deleting /jobs/tasks/<JobId> #277

v1.1.1021 (2019-07-08)

Closed issues:

v1.1.1016 (2019-06-30)

Closed issues:

  • update base image tag automatically #399

v1.1.1015 (2019-06-30)

Fixed bugs:

  • add/edit algorithm description #406
  • input output details on a worker #401
  • view pod logs from the workers tab #336
  • the UI is very slow #333
  • pipelines page- show pipeline details #327

v1.1.1014 (2019-06-30)

Fixed bugs:

  • Fix API CLI documentation #369

v1.1.1010 (2019-06-25)

Fixed bugs:

  • post request endpoints are not correct #337
  • update algorithm #335
  • running pipeline with input #332
  • creating a pipeline with the wizard #331
  • filter the jobs table by some key then run "simple" pipeline - UI crashes #329

v1.1.1009 (2019-06-24)

Merged pull requests:

v1.1.1007 (2019-06-23)

Closed issues:

  • Restore missing features #381

v1.1.1003 (2019-06-19)

Merged pull requests:

v1.1.1001 (2019-06-18)

Fixed bugs:

  • workers page open pod details crashes the page #334
  • create pipeline in the ui doesn't show nodes to the pipeline json #330
  • jobs page - Unnecessary filter icon at the Pipeline status column #328
  • Fix issue with pipeline rerun #312

Closed issues:

  • Switch to WS as default #325
  • Add memory human readable property #311
  • Dashboard Refactor #310
  • Fix issues with env: entry-point and worker-protocol for debug #307
  • reduce node_modules size when building docker by: npm i --production #285

Merged pull requests:

v1.1.1 (2019-06-04)

Implemented enhancements:

  • Add a tool-tip to icons in main menu bar when menu is collapsed #315
  • Move to Node v10 #210
  • Add rate-limiter to api-server #21

Fixed bugs:

  • Problem in selected Algorithm logs tab, in pipeline graph view. #313
  • worker that got a stopped job doesn't clear well #296
  • simulator gets stuck; only a monitor server restart helps, and not for long #295
  • error key in json was added even when there is no error in the pipeline #286
  • trigger service will not exit if unable to reconnect to ETCD #275
  • pipeline should report finish only if result was successfully saved to storage and not before. #257
  • driver should report error if process failed due to memory usage more than 256MB #256
  • reconciler should kill redundant pipeline-drivers if defined less than actual running drivers #255
  • a pipeline that worked stopped working and fails after crashLoopBackOff #243
  • add empty LocationConstraint (production bug) #240
  • CrashLoopBackOff when algorithm exit and pipeline stopped #238
  • sometimes storageInfo does not exist #237
  • delete algorithm after all pipelines completed and pipeline was deleted is not working #235
  • task executor sometimes request for double the workers it really needs #227
  • resource allocation for a pod is not as described #220
  • failed to start pipeline if "input" was missing from pipeline descriptor #216
  • last algorithm in a batch is taking too much time to complete #215
  • ttl reason for expiration time should be in real time, not epoch #212
  • override stored pipeline options #209
  • algorithm operator stuck when etcd fails to connect #206
  • unable to find handler for job #204
  • transition is invalid in current state #203
  • Cannot read property 'filter' of undefined #202
  • Pipeline that failed gets stopped status instead of failed #198
  • delete an algorithm is not possible #197
  • resource manager requests more resources than available #196
  • algorithm that failed and threw an error stays up in the system and is not cleaned/deleted #193
  • stop pipeline doesn't work #192
  • algorithm queue scores was not ordered when new job added during scoring update #188
  • wait any returns wrong number of results. #187
  • failed algorithm returns null response in the results #177
  • execute a request and immediately stop it, the request remains in status "stopping" #175
  • Make triggers pipeline work the other way #173
  • pipeline result is null/empty #167
  • Fix AlgorithmQueue active count #163
  • logs from the algorithm are sometimes not written to the Elasticsearch logs #148
  • pod stopped working, show an error that needs to be tracked #147
  • max retries of an algorithm should fail the pipeline #146
  • large batch size doesn't run #144
  • Etcd periodic cleaner #142
  • Pipeline request using pipeline which isn't stored in the repository of pipeline descriptors #43
  • When stopping pipeline while worker is init #40
  • worker tries to fetch null data from storage provider #39
  • Run batch on node results not working due to new storage feature #38
  • apak algo doesn't deal with stop #26
  • when stopping an apak pipeline the amount of watchers is increasing #25
  • exec stored with a large amount of rows in the body of the request returns a 413 error "payload too large" #20
  • batch tolerance 0% sets the default to 80 #19
  • inconsistent batch tolerance behaviour #18
  • max retry on failing pods should return an error and close pipeline #17
  • logs take very long to be written to ES #14
  • pipelines that sometimes do not end the run (inconsistent) #13
  • results of a pipeline come with a big delay #12
  • pending pipelines that were stopped before starting are never stopped #11
  • get status of a pipeline after some nodes have finished #10
  • worker crashes after timeout between the worker and the algorithm #8
  • the '0' result or false result are not returned to the api-server #5
  • inconsistent end status received after PL completed #2

Closed issues:

  • Clean docker trash after build #303
  • Add stream support in FS storage adapter #301
  • GPU resource management seems to not work well #297
  • bucket name should also include cluster name to be able to support multiple clients from different clusters #294
  • Need to re-enter tasks to queue after stalled #260
  • Simplify API for adding algorithm to HKUBE #250
  • Add description field to pipeline #245
  • reduce logs on file watch error #244
  • test issue #242
  • validate exec raw run without trigger #231
  • resource manager requests more resources than the cluster has; this results in workers stuck in pending status that don't exit on their own #230
  • batch on empty array not working #229
  • validate deep calls of trigger pipelines do not exceed the limitation #228
  • Check why pipeline-driver has retry job #222
  • Insufficient cpu for pipeline drivers #219
  • Check if exec/tree still works #211
  • stop pipeline if expired #186 #195 #208
  • unable to create job with capital letter #199
  • Add expiration time #195
  • Add support for /internal/sub-pipelines & /internal/stop #194
  • Review ETCD tree architecture #189
  • Add delete execution after job completed #186
  • Add watch state for all jobs in etcd #183
  • Create common k8s pods npm package #182
  • storage cleaner #181
  • Implement stop in pipeline-driver-queue #180
  • etcd issues when saving large data #179
  • When pipeline failed, need to send stop to all workers (batch) #178
  • change the pipeline name on /exec/raw #176
  • add validation to pipeline/algorithms name #174
  • add pipelines results stored/raw #172
  • pipeline-driver queue support #171
  • Add Prometheus cpu metric #169
  • enternceTime #168
  • Implement stop in algorithm-queue #164
  • When we have triggers of pipelines we get a very long jobId which causes an error in minio #161
  • Fix pipeline-driver-queue issue #160
  • Change the resource manager to support the pipeline driver queue #152
  • Create pipeline driver queue #151
  • Add pre-commit script to package.json #150
  • Validate algorithmName in nodes pipeline and exists in store #149
  • Need to think about required cpu vs actual cpu #145
  • Scale api-server #143
  • Add unit parsing to /algorithms API #140
  • Handle algorithm deletion #138
  • S3 Buckets limit #137
  • Add API for algorithms store #136
  • Katacoda #135
  • Add WS adapter #134
  • ETCD cleanup & storing results on custom storage #131
  • Add trigger tracking #130
  • Landing Page #127
  • api server should store input in custom storage #126
  • resource executor meeting status #125
  • Resource Allocation #124
  • View statistics of algorithm's pending requests in a graph view #112
  • View statistics of pipeline's pending requests in a graph view #104
  • NFRs Version 100 #85
  • NFRs ver. Beta1 #79
  • Receiving 8 pipeline requests within an hour #78
  • Concurrent execution of 4 pipeline requests #77
  • Pipeline request with a batch of 1,000 algorithms #76
  • HKube execution overheads #75
  • High Availability for algorithm services #74
  • High Availability for HKube infrastructure services - version Beta1 #73
  • Pipelines' execution management #66
  • Retry algorithm execution, if system resource error (e.g. VM/ Docker failure) occurred during the algorithm execution #65
  • Cancel pipeline execution, if the maximum retries for execution of failed algorithm exceeded #62
  • production jaeger deployment #37
  • The name of the pipeline will be its identifier #35
  • Part or all of the inputs of algorithm are taken from the pipeline's request [R3] [R6] #34
  • Batch size is determined by the pipeline's input #30
  • Define algorithms which will be executed in the beginning of the pipeline [R8] #27
  • api server should return data instead of custom-storage link #24
  • data to custom storage should support stream #23
  • stop pipeline sometimes doesn't work #9
  • validation on flowInput on store pipeline #7
  • results responses come with jobid object #6
  • cancel completed PL. #4
  • each time a PL is finished, the results api should be updated #3

* This Changelog was automatically generated by github_changelog_generator