Releases: jina-ai/serve
🎉 Release v0.9.2
We are excited to release Jina 0.9.2. Jina is the easier way to do neural search in the cloud. Highlights of this release include:
- Support for delete/update operations
- Add native AsyncIO support and unlock native support for running Jina in Jupyter notebooks and IPython
- Add `MultimodalDocument` as a primitive type to support multimodal search in a Pythonic way
- Refactor `Pea` and introduce `Runtime` to improve code readability and maintainability
Release 0.9.2
⬆️ Major Features and Improvements
Completeness
- To support update and delete operations, we introduced `update` and `delete` methods for `BaseKVIndexer` and `BaseVectorIndexer`; `update` and `delete` APIs are also introduced in the Flow APIs. #1380, #1415, #1550, #1460
- Refactoring to native asyncio. This unlocks support for running Jina in Jupyter notebooks and IPython. `AsyncClient` and `AsyncFlow` were added to let users manage the `eventloop` and make Jina more reliable in Jupyter notebooks and IPython. #1348, #1408, #1410, #1428, #1450, #1453, #1463, #1562
Example:
from jina import AsyncFlow
# Top-level `await` works in Jupyter notebooks and IPython; in a plain script, wrap this in an `async def`.
with AsyncFlow().add(uses='_logforward') as f:
    await f.index_lines(lines=['hello', 'jina'], on_done=print)
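To make the update and delete semantics from the Completeness items more concrete, here is a minimal sketch of a key-value index that supports adding, updating, and deleting entries. `ToyKVIndexer`, its method signatures, and its behavior for unknown keys are assumptions made for illustration only; they are not the `BaseKVIndexer` API.

```python
from typing import Dict, Iterable, Optional


class ToyKVIndexer:
    """Hypothetical in-memory key-value index (not Jina's BaseKVIndexer)."""

    def __init__(self) -> None:
        self._store: Dict[str, bytes] = {}

    def add(self, keys: Iterable[str], values: Iterable[bytes]) -> None:
        for key, value in zip(keys, values):
            self._store[key] = value

    def update(self, keys: Iterable[str], values: Iterable[bytes]) -> None:
        # Assumption: only keys that already exist are overwritten.
        for key, value in zip(keys, values):
            if key in self._store:
                self._store[key] = value

    def delete(self, keys: Iterable[str]) -> None:
        # Assumption: deleting a missing key is a no-op.
        for key in keys:
            self._store.pop(key, None)

    def query(self, key: str) -> Optional[bytes]:
        return self._store.get(key)


indexer = ToyKVIndexer()
indexer.add(['doc1', 'doc2'], [b'hello', b'jina'])
indexer.update(['doc1', 'doc3'], [b'hi', b'ignored'])
indexer.delete(['doc2'])
```

After these calls, only `doc1` remains in the toy index, holding its updated value.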
Ease of Use
- Add `MultimodalDocument` as a primitive type. This lets users build a multimodal search system in a Pythonic way. #1335, #1385, #1390, #1368, #1395, #1399, #1401
Example:
import numpy as np
from jina import Document, MultimodalDocument
# Each chunk carries one modality together with its embedding
chunk_img = Document(modality='dummy_image', embedding=np.random.rand(1, 4))
chunk_text = Document(modality='dummy_text', embedding=np.random.rand(1, 10))
multimodal_doc = MultimodalDocument(chunks=[chunk_img, chunk_text])
- Introduce `Runtime` as a member of `Pea`, defined as "a procedure that blocks the main process once running, therefore must be put into a separated thread/process". The new architecture greatly improves the readability and maintainability of the code. #1426, #1473, #1487, #1539, #1577
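The definition of `Runtime` quoted above can be pictured with a small standard-library sketch: a blocking loop is moved into its own thread so that the main thread stays responsive. `ToyRuntime` and its methods are hypothetical names chosen for this illustration, not Jina's actual `Runtime` interface.

```python
import threading
import time


class ToyRuntime:
    """Hypothetical blocking procedure (not Jina's Runtime class)."""

    def __init__(self) -> None:
        self.is_ready = threading.Event()
        self._stop = threading.Event()
        self.ticks = 0

    def run_forever(self) -> None:
        # Blocks the calling thread until cancel() is called.
        self.is_ready.set()
        while not self._stop.is_set():
            self.ticks += 1
            time.sleep(0.01)

    def cancel(self) -> None:
        self._stop.set()


runtime = ToyRuntime()
# Run the blocking loop in a separate thread so the main thread stays free.
worker = threading.Thread(target=runtime.run_forever, daemon=True)
worker.start()
runtime.is_ready.wait(timeout=1.0)
time.sleep(0.05)
runtime.cancel()
worker.join(timeout=1.0)
```

Calling `run_forever()` directly on the main thread would block it until `cancel()` is called from somewhere else, which is why the wrapper thread is needed.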
⚠️ Breaking Changes
- Introduce `UniqueId`, `ChunkSet`, `DocumentSet`, `MatchSet`; remove `add_chunk` and `add_match`; refactor `Document` with the newly introduced classes. #1343
Example:
0.8.0 | 0.9.2 |
from jina import Document
with Document() as d:
    c = Document(id='1:0>16')
    d.chunks.append(c)
with Document() as d:
    c = d.chunks.append()
    c.id = '1:0>16' |
from jina import Document
with Document() as d:
    c = d.chunks.add_chunks()
    c.id = '1:0>16' |
- Refactor the YAML file parsing backend from `ruamel.yaml` to `pyyaml` and introduce `jina.jaml` for parsing YAML files. The dependency on `ruamel.yaml` is deprecated. #1495, #1516, #1524, #1533, #1547, #1581
- Add `_merge_matches` and `_merge_chunks` for merging messages in different ways. Remove `_merge_all`. #1406, #1418
- `PyClient` renamed to `Client` for simplicity. #1450
📗 Documentation
- Update Korean Readme #1364 @doomdabo
- Add code review guide #1397
- Fix typos in `helloworld.html`. #1405 @harry-stark
- Add documentation for recursive data structure. #1394
- Fix redundant translation in Chinese Readme #1443 @smy0428
- Fix missing CLI content. #1481
- Fix typos in `README.md`. #1500 @Kavan72
- Improve docs.jina.ai. #1513, #1514, #1586
- Improve `TorchDevice` docstring #1499 @tadej-redstone
- Fix typos in Russian Readme #1544, #1572 @git-webmaster
- Fix typos in CLI interface #1578 @xinbinhuang
- Add Spanish Readme #1579 @PabloRN
🐞 Bug Fixes and Other Changes
Flow
- Fix issue terminating `RemotePea` #133
- Refactor Pea closing logic #1379, #1398, #1457
- Refactor `peapods` code base #1421
- Add versioning for Flow YAML config files. Introduce `method` field for Flow YAML configurations. #1442
- Add `env` field for Flow and Pod YAML configuration so that shared environment variables can be set. #1446, #1448
- Rename Flow `output` argument to `on_done`. #1476
- Fix client `top_k` malfunctioning bug. #1522
- Add `return_list` option for Flow API and introduce `Response` as a new primitive type. When `return_list=True`, the returned results are a list of `Response` objects, which makes them easy to interpret. #1541
- Fix CORS behavior bug for REST API #1568 @yk
Executors
- Change default metric of `NumpyIndexer` to cosine #1393
- Remove deprecated `jina/executors/encoders/helper.py` #1563 @tadejsv
- Introduce `batching_multi_input` decorator to add batching support for rankers #1467 @deepampatel
- Allow Indexers to have separate workspaces. #1383
- Fix bug when shards are empty #1340, #1396
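Since `NumpyIndexer` now defaults to the cosine metric, a short reminder of what that metric computes may be useful. The pure-Python sketch below only illustrates the formula; `cosine_distance` is a hypothetical helper and does not reflect how the indexer implements it.

```python
import math
from typing import Sequence


def cosine_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Return 1 - cos(angle between a and b)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError('cosine distance is undefined for zero vectors')
    return 1.0 - dot / (norm_a * norm_b)


same_direction = cosine_distance([1.0, 1.0], [2.0, 2.0])
orthogonal = cosine_distance([1.0, 0.0], [0.0, 1.0])
```

Vectors pointing in the same direction have a distance close to 0 regardless of their length, while orthogonal vectors have a distance of 1.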
Drivers
- Add `op_name` for `Matches2DocRankDriver` #1409
- Add `batch_size` argument for `EncodeDriver` to enable batching on driver level #1483
- Make DocIdCache capable of detecting collisions on content level #1510
- Enable `AggregateMatches2DocRankDriver` for keeping chunks of matches #1494
Types
- Add `NamedScore` as new primitive type. #1430
- Support `+` and `+=` operations for `Document`. #1555
- Move `extract_content()` to `DocumentSet`. Instead of using `docs = DocumentSet(random_docs(2)); extract_content(docs)`, `docs.all_contents()` makes it easier to get contents from a set of `Document`s. #1387
- Refactor `random_id` and introduce `content_hash` field in `Document`. #1440
Tests
- Improve unit tests for `test_hello_world` #1305
- Refactor unit tests for `queryset` #1336
- Refactor unit tests for evaluation #1339
- Refactor unit tests for index remote #1346
- Fix integration tests for `jinad` #1367, #1388, #1407
- Refactor `random_docs()` in unit tests #1356
- Add unit tests for convert functions in `Document` #1389
- Fix callbacks in unit tests. `callback` failures could previously go uncaptured by tests #1391
- Fix integration tests for evaluation #1411
- Refactor docstrings in unit tests of `QueryLangSet` #1417
- Fix bug failing to capture errors of callbacks during unit tests. #1419, #1536
- Refactor unit tests for `types` #1435
- Refactor unit tests for `request` #1445
- Add unit tests for corner cases in calculating similarity metrics #1434
- Add evaluation option for hello-world #1465, #1488, #1508, #1501
- Add test for loading customized drivers #1474
- Refactor unit test for `drivers` #1452
- Set default value of `eval_at` in `PrecisionEvaluator` and `RecallEvaluator` to `None` #1552
- Fix unit tests of `test_hub_usage` when `GITHUB_TOKEN` is used. #1560
- Refactor unit tests for `drivers` #1559
- Refactor unit tests in `hubio` to use `BuildTestLevel` #1361
- Fix naming for `test_rankingevaluation_driver` #1573
HubIO
- Fix Jina Hub automated updates and add GA for updating Jina Hub images. Check out more details at hub-updater #1298, #1345, #1360, #1456
- Redefine naming convention of Docker images in Jina Hub. Naming follows `{repository}/{type}.{kind}.{name}:{version}-{jina_version}` #1341
- Avoid overwriting Docker image in Jina Hub when tag already exists. #1365
- Clean up `hubio` imports. #1381
- Fix `hubio` version checking and add `--no-overwrite` option for `jina hub --push` #1403
- Fix `hubio` test levels #1361
- Add `--timeout-ready` option for `hubio` #1525
- Fix typo in error message #1531
- Fix access to token credential file for `jina hub push` #1492
- Switch to `hubapi` for retrieving Docker login information #1429, #1589
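As an illustration of the Docker image naming convention listed above, the following sketch assembles a name from its parts. `hub_image_name` is a hypothetical helper and all argument values are made-up placeholders; only the template string comes from the release note.

```python
def hub_image_name(repository: str, type_: str, kind: str, name: str,
                   version: str, jina_version: str) -> str:
    """Format an image name as {repository}/{type}.{kind}.{name}:{version}-{jina_version}."""
    return f'{repository}/{type_}.{kind}.{name}:{version}-{jina_version}'


# All argument values below are made-up placeholders.
image = hub_image_name('myrepo', 'pod', 'encoder', 'myencoder', '0.0.1', '0.9.2')
```

The tag combines the version of the image itself with the Jina version it was built against.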
Others
- Adapt to new remote log APIs #1300
- Adapt to Docker SDK 4.4.0 in `ContainerPea` #1334
- Move log parser from `jinad` to core. #1342
- Use `load_config` directly as a classmethod #1352, #1354
- Fix bug during completing file path for errors #1353
- Fix top-k setting bug #1359
- Fix newlines for autocompletion in bash. #1425 @lsgrep
- Fix latency check during CI #1437
- Add client-side exception handlers #1458, #1462
- Add GA for automated comments on lint failures. #1486, #1507, #1519
- Introduce `ArgNamespace` in `jina.helper` to manage all namespace-related operations #1489
- Introduce training. #1518
- Introduce `jina.jaml` for parsing YAML files. #1533, #1547, #1581
- Fix bug in parsing config source files #1583
🙏 Thanks to our Contributors
This release contains contributions from Amritpal Singh, Bithiah Yuan, CatStark, Deepam Patel, Deepankar Mahapatro, Florian Hönicke, Han Xiao, Harry Stark, Hidan, Joan Fontanals, Nan Wang, Pratik Bhavsar, Rutuja Surve, Sergey M, Siyuan Shi, Szymon Skorupinski, Tadej Svetina, Wang Bo, Yannic Kilcher, Yusup, cristian, florian-hoenicke
🙏 Thanks to our Community
And thanks to all of you out there as well! Without you Jina couldn't do what we do. Your support means a lot to us.
🤝 Work with Jina
Want to work with Jina full-time? Check out our openings on our website.
🎉 Release v0.9.1
Release Note (`0.8.22`)
Release time: 2021-01-03 23:25:16
🙇 We'd like to thank all contributors for this new release! In particular, Han Xiao, Jina Dev Bot 🙇
📗 Documentation
- [`3178fef8`] - update readme (Han Xiao)
🍹 Other Improvements
- [`2e7786cb`] - version: the next version will be 0.8.22 (Jina Dev Bot)
🎉 Jina 0.9.0
Jina 0.9.0
We are excited to release Jina 0.9.0. Jina is the easier way to do neural search in the cloud. Highlights of this release include:
- Support for delete/update operations
- Add native AsyncIO support and unlock native support for running Jina in Jupyter notebooks and IPython
- Add `MultimodalDocument` as a primitive type to support multimodal search in a Pythonic way
- Refactor `Pea` and introduce `Runtime` to improve code readability and maintainability
Release 0.9.0
⬆️ Major Features and Improvements
Completeness
- To support update and delete operations, we introduced `update` and `delete` methods for `BaseKVIndexer` and `BaseVectorIndexer`; `update` and `delete` APIs are also introduced in the Flow APIs. #1380, #1415, #1550, #1460
- Refactoring to native asyncio. This unlocks support for running Jina in Jupyter notebooks and IPython. `AsyncClient` and `AsyncFlow` were added to let users manage the `eventloop` and make Jina more reliable in Jupyter notebooks and IPython. #1348, #1408, #1410, #1428, #1450, #1453, #1463, #1562
Example:
from jina import AsyncFlow
# Top-level `await` works in Jupyter notebooks and IPython; in a plain script, wrap this in an `async def`.
with AsyncFlow().add(uses='_logforward') as f:
    await f.index_lines(lines=['hello', 'jina'], on_done=print)
Ease of Use
- Add `MultimodalDocument` as a primitive type. This lets users build a multimodal search system in a Pythonic way. #1335, #1385, #1390, #1368, #1395, #1399, #1401
Example:
import numpy as np
from jina import Document, MultimodalDocument
# Each chunk carries one modality together with its embedding
chunk_img = Document(modality='dummy_image', embedding=np.random.rand(1, 4))
chunk_text = Document(modality='dummy_text', embedding=np.random.rand(1, 10))
multimodal_doc = MultimodalDocument(chunks=[chunk_img, chunk_text])
- Introduce `Runtime` as a member of `Pea`, defined as "a procedure that blocks the main process once running, therefore must be put into a separated thread/process". The new architecture greatly improves the readability and maintainability of the code. #1426, #1473, #1487, #1539, #1577
⚠️ Breaking Changes
- Introduce `UniqueId`, `ChunkSet`, `DocumentSet`, `MatchSet`; remove `add_chunk` and `add_match`; refactor `Document` with the newly introduced classes. #1343
Example:
0.8.0 | 0.9.0 |
from jina import Document
with Document() as d:
    c = Document(id='1:0>16')
    d.chunks.append(c)
with Document() as d:
    c = d.chunks.append()
    c.id = '1:0>16' |
from jina import Document
with Document() as d:
    c = d.chunks.add_chunks()
    c.id = '1:0>16' |
- Refactor the YAML file parsing backend from `ruamel.yaml` to `pyyaml` and introduce `jina.jaml` for parsing YAML files. The dependency on `ruamel.yaml` is deprecated. #1495, #1516, #1524, #1533, #1547, #1581
- Add `_merge_matches` and `_merge_chunks` for merging messages in different ways. Remove `_merge_all`. #1406, #1418
- `PyClient` renamed to `Client` for simplicity. #1450
📗 Documentation
- Update Korean Readme #1364 @doomdabo
- Add code review guide #1397
- Fix typos in `helloworld.html`. #1405 @harry-stark
- Add documentation for recursive data structure. #1394
- Fix redundant translation in Chinese Readme #1443 @smy0428
- Fix missing CLI content. #1481
- Fix typos in `README.md`. #1500 @Kavan72
- Improve docs.jina.ai. #1513, #1514, #1586
- Improve `TorchDevice` docstring #1499 @tadej-redstone
- Fix typos in Russian Readme #1544, #1572 @git-webmaster
- Fix typos in CLI interface #1578 @xinbinhuang
- Add Spanish Readme #1579 @PabloRN
🐞 Bug Fixes and Other Changes
Flow
- Fix issue terminating `RemotePea` #133
- Refactor Pea closing logic #1379, #1398, #1457
- Refactor `peapods` code base #1421
- Add versioning for Flow YAML config files. Introduce `method` field for Flow YAML configurations. #1442
- Add `env` field for Flow and Pod YAML configuration so that shared environment variables can be set. #1446, #1448
- Rename Flow `output` argument to `on_done`. #1476
- Fix client `top_k` malfunctioning bug. #1522
- Add `return_list` option for Flow API and introduce `Response` as a new primitive type. When `return_list=True`, the returned results are a list of `Response` objects, which makes them easy to interpret. #1541
- Fix CORS behavior bug for REST API #1568 @yk
Executors
- Change default metric of `NumpyIndexer` to cosine #1393
- Remove deprecated `jina/executors/encoders/helper.py` #1563 @tadejsv
- Introduce `batching_multi_input` decorator to add batching support for rankers #1467 @deepampatel
- Allow Indexers to have separate workspaces. #1383
- Fix bug when shards are empty #1340, #1396
Drivers
- Add `op_name` for `Matches2DocRankDriver` #1409
- Add `batch_size` argument for `EncodeDriver` to enable batching on driver level #1483
- Make DocIdCache capable of detecting collisions on content level #1510
- Enable `AggregateMatches2DocRankDriver` for keeping chunks of matches #1494
Types
- Add `NamedScore` as new primitive type. #1430
- Support `+` and `+=` operations for `Document`. #1555
- Move `extract_content()` to `DocumentSet`. Instead of using `docs = DocumentSet(random_docs(2)); extract_content(docs)`, `docs.all_contents()` makes it easier to get contents from a set of `Document`s. #1387
- Refactor `random_id` and introduce `content_hash` field in `Document`. #1440
Tests
- Improve unit tests for `test_hello_world` #1305
- Refactor unit tests for `queryset` #1336
- Refactor unit tests for evaluation #1339
- Refactor unit tests for index remote #1346
- Fix integration tests for `jinad` #1367, #1388, #1407
- Refactor `random_docs()` in unit tests #1356
- Add unit tests for convert functions in `Document` #1389
- Fix callbacks in unit tests. `callback` failures could previously go uncaptured by tests #1391
- Fix integration tests for evaluation #1411
- Refactor docstrings in unit tests of `QueryLangSet` #1417
- Fix bug failing to capture errors of callbacks during unit tests. #1419, #1536
- Refactor unit tests for `types` #1435
- Refactor unit tests for `request` #1445
- Add unit tests for corner cases in calculating similarity metrics #1434
- Add evaluation option for hello-world #1465, #1488, #1508, #1501
- Add test for loading customized drivers #1474
- Refactor unit test for `drivers` #1452
- Set default value of `eval_at` in `PrecisionEvaluator` and `RecallEvaluator` to `None` #1552
- Fix unit tests of `test_hub_usage` when `GITHUB_TOKEN` is used. #1560
- Refactor unit tests for `drivers` #1559
- Refactor unit tests in `hubio` to use `BuildTestLevel` #1361
- Fix naming for `test_rankingevaluation_driver` #1573
HubIO
- Fix Jina Hub automated updates and add GA for updating Jina Hub images. Check out more details at hub-updater #1298, #1345, #1360, #1456
- Redefine naming convention of Docker images in Jina Hub. Naming follows `{repository}/{type}.{kind}.{name}:{version}-{jina_version}` #1341
- Avoid overwriting Docker image in Jina Hub when tag already exists. #1365
- Clean up `hubio` imports. #1381
- Fix `hubio` version checking and add `--no-overwrite` option for `jina hub --push` #1403
- Fix `hubio` test levels #1361
- Add `--timeout-ready` option for `hubio` #1525
- Fix typo in error message #1531
- Fix access to token credential file for `jina hub push` #1492
- Switch to `hubapi` for retrieving Docker login information #1429, #1589
Others
- Adapt to new remote log APIs #1300
- Adapt to Docker SDK 4.4.0 in `ContainerPea` #1334
- Move log parser from `jinad` to core. #1342
- Use `load_config` directly as a classmethod #1352, #1354
- Fix bug during completing file path for errors #1353
- Fix top-k setting bug #1359
- Fix newlines for autocompletion in bash. #1425 @lsgrep
- Fix latency check during CI #1437
- Add client-side exception handlers #1458, #1462
- Add GA for automated comments on lint failures. #1486, #1507, #1519
- Introduce `ArgNamespace` in `jina.helper` to manage all namespace-related operations #1489
- Introduce training. #1518
- Introduce `jina.jaml` for parsing YAML files. #1533, #1547, #1581
- Fix bug in parsing config source files #1583
🙏 Thanks to our Contributors
This release contains contributions from Amritpal Singh, Bithiah Yuan, CatStark, Deepam Patel, Deepankar Mahapatro, Florian Hönicke, Han Xiao, Harry Stark, Hidan, Joan Fontanals, Nan Wang, Pratik Bhavsar, Rutuja Surve, Sergey M, Siyuan Shi, Szymon Skorupinski, Tadej Svetina, Wang Bo, Yannic Kilcher, Yusup, cristian, florian-hoenicke
🙏 Thanks to our Community
And thanks to all of you out there as well! Without you Jina couldn't do what we do. Your support means a lot to us.
🤝 Work with Jina
Want to work with Jina full-time? Check out our openings on our [website](https://jina.ai/...
🎉 Release v0.8.22
Release Note (`0.8.21`)
Release time: 2021-01-03 16:37:08
🙇 We'd like to thank all contributors for this new release! In particular, Han Xiao 🙇
🏁 Unit Test and CICD
- [`a431537f`] - fix github-push-action default branch (Han Xiao)
- [`b2430fbd`] - fix tag release order (Han Xiao)
🍹 Other Improvements
- [`b2689933`] - hotfix release (Han Xiao)
🎉 Release v0.8.21
Release Note (`0.8.18`)
Release time: 2021-01-01 15:44:56
🙇 We'd like to thank all contributors for this new release! In particular, Han Xiao, Florian Hönicke, Jina Dev Bot 🙇
🐞 Bug fixes
🚧 Code Refactoring
📗 Documentation
🍹 Other Improvements
🎉 Release v0.8.20
Release Note (`0.8.18`)
Release time: 2021-01-01 15:44:56
🙇 We'd like to thank all contributors for this new release! In particular, Han Xiao, Florian Hönicke, Jina Dev Bot 🙇
🐞 Bug fixes
🚧 Code Refactoring
📗 Documentation
🍹 Other Improvements
🎉 Release v0.8.19
Release Note (`0.8.18`)
Release time: 2021-01-01 15:44:56
🙇 We'd like to thank all contributors for this new release! In particular, Han Xiao, Florian Hönicke, Jina Dev Bot 🙇
🐞 Bug fixes
🚧 Code Refactoring
📗 Documentation
🍹 Other Improvements
🎉 Jina 0.8.0
We are excited to release Jina 0.8.0. Jina is an easier way to do neural search on the cloud. Highlights of this release include:
- Introduce jinad to improve experience of using remote Flows/Pods/Peas
- Add support for multimodal search
SparseArray
- Add `jina.types` module to offer a Pythonic interface to access and manipulate protobuf objects.
Release 0.8.0
⬆️ Major Features and Improvements
Ease of Use
- We introduce two new ways of using Jina Pods remotely:
  - Create a remote Pod via SSH #1275
  - Create a remote Pod via jinad. Jinad is a daemon process working together with jina on remote machines. Jinad makes it even easier to deploy Jina Flows/Pods/Peas on remote machines. Find out more details in the README #1182, #1203, #1254, #1297, #1299, #1307, #1312, #1324
Example:
RemoteSSHPod | Jinad API |
---|---|
`jina pod --host [email protected] --remote-access SSH` | `jina pod --host 11.22.33.44 --port-expose 8000 --remote-access JINAD` |
With jinad, you can create and use Pods directly from the Flow as well. Start the Docker container equipped with jinad on the remote machine as follows:
sudo docker run --rm -d --network host jinaai/jinad
Now you can directly create and use the remote Pods from your local machine:
from jina.flow import Flow
f = (Flow()
     .add(name='p1', uses='_logforward')
     # p2 runs on the remote machine managed by jinad
     .add(name='p2', host='10.11.22.33', port_expose='8000', uses='_logforward'))
with f:
    f.search_lines(lines=['jina', 'is', 'cute'], output_fn=print)
- We've added the `jina.types` module, which offers a Pythonic interface to access and manipulate protobuf objects. The main types include `Request`, `QueryLang`, `NdArray`, `Message`, and `Document`. With the help of Jina types, you can construct inputs to Jina much more easily than before. #1283, #1284, #1289, #1323
Example:
v0.7.0 | v0.8.0 | |
Document |
from jina.proto import jina_pb2
d = jina_pb2.DocumentProto()
d.text = 'hello world' |
from jina import Document
d = Document()
d.text = 'abc'
|
Request |
from jina.proto import jina_pb2
r = jina_pb2.Request()
d = r.docs.add() |
from jina.types.request import Request
from jina.types.document import Document
r = Request()
d = Document()
r.add_document(d)
|
Message |
from jina.proto import jina_pb2
r = jina_pb2.RequestProto.IndexRequestProto()
m = jina_pb2.MessageProto()
m.envelop = None
m.request = r |
from jina.types.message import Message
from jina.types.request import Request
r = Request()
m = Message(None, r)
|
QueryLang |
from jina.proto import jina_pb2
ql = jina_pb2.QueryLangProto(name='SliceQL')
ql.parameters['start'] = 1
ql.parameters['end'] = 3 |
from jina.types.querylang import QueryLang
ql = QueryLang(SliceQL(start=1, end=3))
|
NdArray |
import numpy as np
from jina.proto import jina_pb2
from jina.drivers.helper import array2pb
a = jina_pb2.NdArrayProto()
a.CopyFrom(array2pb(np.ndarray([2, 17]))) |
import numpy as np
from jina.types.ndarray.generic import NdArray
a = NdArray()
a.value = np.ndarray([2, 17])
|
Completeness
- To support multimodal search, we've introduced `BaseMultiModalEncoder` and `MultimodalDriver`. Check out how to search fashion items with text and images together at Jina examples. #1141, #1144, #1154, #1156
- We've introduced `Classifiers`, a new type of executor. A `Classifier` is designed to enrich the Documents with tags. Check out more details at docs.jina.ai #1194
⚠️ Breaking Changes
- Refactor drivers for evaluation from function-based to type-based. #1165
  - Removed `EncodeEvaluationDriver` and `CraftEvaluationDriver`
  - Added `TextEvaluateDriver`, `NDArrayEvaluateDriver`, and `FieldEvaluateDriver`
  - `RankingEvaluationDriver` renamed to `RankEvaluateDriver`
- Introduce `SparseNdArray` and provide a generic interface for `SparseNdArray` and `DenseNdArray` #1190, #1283
Example:
v0.7.0 | v0.8.0 | |
dense array |
import numpy as np
from jina.proto import jina_pb2
from jina.drivers.helper import array2pb
a = jina_pb2.NdArrayProto()
a.CopyFrom(array2pb(np.ndarray([2, 17])))
|
import numpy as np
from jina.types.ndarray.generic import NdArray
a = NdArray()
a.value = np.ndarray([2, 17])
|
sparse array |
not supported
|
import numpy as np
from jina.types.ndarray.generic import NdArray
from jina.types.ndarray.sparse.scipy import SparseNdArray
from scipy.sparse import coo_matrix
row = np.array([20, 0])
col = np.array([0, 20])
data = np.array([2, 17])
a = NdArray(is_sparse=True, sparse_cls=SparseNdArray)
a.value = coo_matrix((data, (row, col)), shape=(21, 21))
|
- Add `callback_on` and `continue_on_error` for the client. `callback_on_body` is removed. #1265
Example:
v0.7.0 | v0.8.0 | |
from jina.flow import Flow
f = (Flow().add(name='p1').add(name='p2'))
with f:
    f.search_lines(lines=['hello', 'jina'], callback_on_body=True)
|
from jina.flow import Flow
f = (Flow().add(name='p1').add(name='p2'))
with f:
    f.search_lines(lines=['hello', 'jina'], callback_on='body')
|
- Add `ProtoMessage`, `LazyRequest` to replace the original `jina_pb2.Message` and `jina_pb2.Request` so that the protobuf message is deserialized in a lazy way #1210, #1283
Example:
v0.7.0 | v0.8.0 | |
from jina.proto import jina_pb2
r = jina_pb2.RequestProto.IndexRequestProto()
m = jina_pb2.MessageProto()
m.envelop = None
m.request = r
|
from jina.types.message import Message
from jina.types.request import Request
r = Request()
m = Message(None, r)
|
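The idea behind lazy deserialization, as introduced with `LazyRequest`, can be sketched in a few lines: the raw bytes are kept as they arrive and parsed only when the content is first accessed. In this sketch JSON stands in for protobuf, and `LazyPayload` is a hypothetical class that does not mirror Jina's implementation.

```python
import json
from typing import Any, Dict, Optional


class LazyPayload:
    """Hypothetical wrapper that parses its raw bytes on first access only."""

    def __init__(self, raw: bytes) -> None:
        self._raw = raw
        self._parsed: Optional[Dict[str, Any]] = None
        self.parse_count = 0

    @property
    def body(self) -> Dict[str, Any]:
        if self._parsed is None:
            # Deserialization is deferred until the body is actually needed.
            self._parsed = json.loads(self._raw.decode('utf-8'))
            self.parse_count += 1
        return self._parsed


payload = LazyPayload(b'{"text": "hello jina"}')
count_before_access = payload.parse_count
text = payload.body['text']
_ = payload.body  # second access reuses the cached result
```

A component that only forwards the message never touches `body` and therefore never pays the parsing cost.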
🐞 Bug Fixes and Other Changes
Flow
- Fix argument overridden bug for Pod when passing arguments from Flow #1189
- Refactor `num_part` logic #1247
- Enable client to interpret `dict` of `json-like str` into parsed documents #1282
- Besides the `callback` function for the `Flow` API, three more actions are added for postprocessing requests: `on_done`, `on_error`, `on_always` #1303
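One way to picture the three postprocessing actions is a small dispatcher. The sketch assumes that `on_done` runs for successful requests, `on_error` for failed ones, and `on_always` in both cases; `handle_response` and its `failed` flag are hypothetical and only illustrate that assumed behavior.

```python
from typing import Callable, List, Optional

Callback = Optional[Callable[[object], None]]


def handle_response(response: object, failed: bool,
                    on_done: Callback = None,
                    on_error: Callback = None,
                    on_always: Callback = None) -> None:
    """Hypothetical dispatcher: pick callbacks based on the request outcome."""
    if failed:
        if on_error:
            on_error(response)
    elif on_done:
        on_done(response)
    # Assumption: on_always runs regardless of success or failure.
    if on_always:
        on_always(response)


events: List[str] = []
callbacks = dict(
    on_done=lambda r: events.append(f'done:{r}'),
    on_error=lambda r: events.append(f'error:{r}'),
    on_always=lambda r: events.append(f'always:{r}'),
)
handle_response('ok-response', failed=False, **callbacks)
handle_response('bad-response', failed=True, **callbacks)
```

Separating the three hooks lets cleanup logic live in `on_always` without being duplicated in the success and error paths.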
Protos
Drivers
- Refactor over-reduce logic to `BaseDriver`. Move `ReduceDriver` function into `BaseDriver`. Merge `PassDriver` and `RouteDriver` into `RouteDriver` #1228
- Adapt the Drivers to the `jina.type` #1313
Tests
- Remove pip cache from Docker images #1168
- Refactor unit tests for `ContainerPea` to pytest #1179
- Switch back to use S3 bucket instead of GitHub for accessing fashionmnist dataset #1183
- Refactor unit tests for `CompoundExecutors` to pytest #1192
- Refactor unit tests for `hello-world` to pytest #1263
- Refactor unit tests for indexing to pytest. #1258, #1237
- Add unit tests for southpark example #1218
- Fix flaky test #1219
- Remove legacy code #1291, #1314
- Adapt unit tests to `jina.type` #1319, #1320, #1322
Usability
- Add `--repository` option for `jina hub` cli so users can push Pod images to their own repository. #1175
- Replace `id_tag` argument with `field` in `RankEvaluateDriver` so users can access all fields of `matches` #1176
Documentation
🎉 release v0.7.0
Jina v0.7.0
We are excited to release Jina v0.7.0. Jina is an easier way to do a neural search on the cloud. Highlights of this release include:
- `Flow` evaluation support
- Support for preventing duplicate Documents in the index
- `Flow` visualization support
Release v0.7.0
⬆️ Major Features and Improvements
Completeness
- Evaluation is fully supported by Jina. `jina.executors.evaluators` and `jina.drivers.evaluate` have been introduced to make this happen. Now you can use different metrics to evaluate the Flow. No matter whether you want to evaluate the whole Flow or just part of it, the evaluation can be done smoothly without stopping the running Flow. #1043, #1086, #1087, #1090, #1092, #1099, #1100, #1102, #1114, #1134
Example:
code | index-doc.yml | eval.yml |
from jina.flow import Flow
from jina.proto import jina_pb2
from jina.drivers.helper import array2pb
import numpy as np
def get_index_docs():
    doc0 = jina_pb2.Document()
    doc0.tags['id'] = '0'
    doc0.embedding.CopyFrom(array2pb(np.array([1, 1])))
    doc1 = jina_pb2.Document()
    doc1.tags['id'] = '1'
    doc1.embedding.CopyFrom(array2pb(np.array([1, -1])))
    return [doc0, doc1]
# indexed two docs
f_index = (Flow().add(uses='index-doc.yml'))
with f_index:
    f_index.index(input_fn=get_index_docs)
def get_eval_docs():
    doc = jina_pb2.Document()
    doc.embedding.CopyFrom(array2pb(np.array([1, 1])))
    groundtruth = jina_pb2.Document()
    match0 = groundtruth.matches.add()
    match0.tags['id'] = '0'
    match1 = groundtruth.matches.add()
    match1.tags['id'] = '2'
    return [(doc, groundtruth), ]
def validate(resp):
    # retrieved docs with id `0` and `1`
    # relevant docs with id `0` and `2`
    # Precision@2 = 0.5
    assert resp.docs[0].evaluations[0].value == 0.5
# evaluate Precision@2
f_eval = (Flow()
          .add(uses='index-doc.yml')
          .add(uses='eval.yml'))
with f_eval:
    f_eval.search(
        input_fn=get_eval_docs,
        output_fn=validate,
        callback_on_body=True)
|
!CompoundIndexer
components:
  - !NumpyIndexer
    metas:
      name: vecidx
  - !BinaryPbIndexer
    metas:
      name: docidx
requests:
  on:
    IndexRequest:
      - !VectorIndexDriver
        with:
          executor: vecidx
          traversal_paths: ['r']
      - !KVIndexDriver
        with:
          executor: docidx
          traversal_paths: ['r']
    SearchRequest:
      - !VectorSearchDriver
        with:
          executor: vecidx
          traversal_paths: ['r']
      - !KVSearchDriver
        with:
          executor: docidx
          traversal_paths: ['m']
|
!PrecisionEvaluator
with:
  eval_at: 2
  id_tag: 'id' |
- To prevent duplicates in the index, `UniquePbIndexer` and `UniqueVectorIndexer` are introduced together with the corresponding drivers in `jina.drivers.cache`. Please refer to docs.jina.ai for more details. #1064, #1081, #1147
Example:
from jina.flow import Flow
from jina.proto import jina_pb2
doc_0 = jina_pb2.Document()
doc_0.text = 'I am doc0'
doc_1 = jina_pb2.Document()
doc_1.text = 'I am doc1'
def assert_num_docs(rsp, num_docs):
    assert len(rsp.IndexRequest.docs) == num_docs
f = Flow().add(
    uses='NumpyIndexer', uses_before='_unique')
with f:
    # doc_0 is sent twice, but only two unique Documents should be indexed
    f.index(
        [doc_0, doc_0, doc_1],
        output_fn=lambda rsp: assert_num_docs(rsp, num_docs=2))
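The `Precision@2 = 0.5` figure in the evaluation example above can be reproduced by hand. The sketch below is a standalone calculation, not the `PrecisionEvaluator` implementation; `precision_at_k` is a hypothetical helper that divides the number of relevant hits in the top `k` by `k`.

```python
from typing import Sequence, Set


def precision_at_k(retrieved_ids: Sequence[str], relevant_ids: Set[str], k: int) -> float:
    """Fraction of the top-k retrieved ids that are relevant."""
    if k <= 0:
        raise ValueError('k must be positive')
    top_k = retrieved_ids[:k]
    hits = sum(1 for doc_id in top_k if doc_id in relevant_ids)
    return hits / k


# Same numbers as the evaluation example: retrieved ids 0 and 1, relevant ids 0 and 2.
score = precision_at_k(['0', '1'], {'0', '2'}, k=2)
```

Only the Document with id `0` is both retrieved and relevant, so one hit out of two gives 0.5.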
Usability
- Add visualization for Flow. Calling the `plot()` function of `Flow` gives a better view of how the Flow looks. #1002, #1116
⚠️ Breaking Changes
- `Document.id`, `Document.parent_id` and `Relevance.ref_id` are now `string` types instead of `int`. Please refer to docs.jina.ai for more details. #1005, #1034, #1136 Accordingly, the following changes are made:
  - `SortQL.field` now uses `dunder_get` syntax rather than `.` expansion (e.g. `a.b.c -> a__b__c`, `score.value -> score__value`) and now supports `dict` and `list` access.
  - `first_doc_id`, `random_doc_id` and `override_doc_id` have been removed from CLI.
- Refactor logger config into YAML. Add `--log-config` to the `jina pea` CLI; by default it points to `logging.default.yml`. `--log-sse`, `--log-profile`, `--log-with-own-name` are deprecated. #1031
The loggers are mapped to different resource files as follows:
Filename | Logger in the code |
---|---|
logging.default.yml | default_logger and any logger defined with JinaLogger() |
logging.docker.yml | logger used in the ContainerPea |
logging.profile.yml | profile_logger |
logging.remote.yml | logger used in the RemotePea |
- Refactor the code for traversing recursive Documents. `granularity_range`, `adjacency_range`, `recur_on` and `recursion_order` are deprecated and replaced by `traversal_paths`. This allows us to specify exactly where the traversal should happen. #995, #998, #1001, #1003, #1006, #1007, #1027, #1036, #1044
- Protobuf `request_id` is now `string` type. `--first-request-id` is removed from the client CLI. `--query-uses` and `--index-uses` from the hello-world CLI are now renamed to `--uses-query` and `--uses-index`. #1049
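The dunder syntax mentioned for `SortQL.field` (for example `score__value`) can be pictured as a path lookup that walks through attributes, `dict` keys, and `list` indices. The helper below is a hypothetical sketch of that idea; `dunder_get`, `Score`, and `Match` here are illustration-only names and not the library's own implementation.

```python
from typing import Any


def dunder_get(obj: Any, path: str) -> Any:
    """Hypothetical helper: resolve 'a__b__0' against attributes, dicts and lists."""
    current = obj
    for part in path.split('__'):
        if isinstance(current, dict):
            current = current[part]
        elif isinstance(current, (list, tuple)):
            current = current[int(part)]
        else:
            current = getattr(current, part)
    return current


class Score:
    value = 0.9


class Match:
    score = Score()
    tags = {'labels': ['cat', 'dog']}


score_value = dunder_get(Match(), 'score__value')
second_label = dunder_get(Match(), 'tags__labels__1')
```

Using `__` as the separator keeps the field name a single token, which a `.`-based expansion cannot offer when the path also needs to address `dict` keys and `list` positions.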
🐞 Bug Fixes and Other Changes
Flow
- Refactor log stream server with `fluentd`. Fluentd acts as a daemon collecting logs from different parts of Jina and forwarding them to a specific output. Check out more details at docs.jina.ai #1002, #999
- Add `ordinal_idx_arg` for batching decorator to support passing ordinal index to indexers #1089
- Refactor `request_id` to uuid #1049
- Refactor logger wrapper #1029
- Add ssh tunneling for Pod. You can specify ssh information #1018
- Switch to hash function for generating ids #1005, #1034
- Allow `--uses-before` and `--uses-after` to be set when `--parallel=1`. Both options only take effect when `parallel > 1`. `_pass` and `_forward` use `RouteDriver` by default. #1112
- Rename `replica_id` to `pea_id` and fix the `PeaRoleType` #1015
- Fix the bug in setting `top_k` #1133 #1138 #1145
Executors
- Add checking for the existence of model paths #1077
- Improve exception handling for the failure of loading pre-trained models #1065
- Fix typing of indexers #1053
- Fix the no attribute error for BaseOnnxEncoder #1107
Drivers
- Fix bug in `QueryDriver` when passing dictionary argument. #1080
CLI
- Improve the hubio module. `jina hub login` supports logging in with OAuth authentication. `jina hub list` lists the available Pods in jina-hub. `jina hub push` supports building and pushing Pod images via Hubapi deployed on AWS API Gateway #1022, #1041, #1118, #1120, #1135
- Add update checking for the jina CLI #1117
Tests & CICD
- Refactor test for Python client #1095
- Add tests for including examples during ci #1088
- Fix dependency conflicts in ci by replacing `[match-py-ver]` with `[cicd]` #1101
- Improve PR review process by adding `CODEOWNERS` #1108
- Refactor to pytest in testing `request` #1045
- Add unit test for helper #1046
- Fix io test #1052
- Fix test coverage #1054, #1056
- Use `pytest` fixture to remove `tmp` files #1021
- Refactor the unit tests to `pytest` style in `test_protobuf` #1121
- Add docker helper test #1115
- Add test in the ci for testing examples #1142
- Add test in the ci for testing hello-world in docker with no devel installed #1139
Documentation
- Add Portuguese translation for `README` #1097
- Add Ukrainian translation for `README.md` #1124
- Fix Russian `README` #1057
- Fix broken links in `README` #1033, #1037, #105
- Fix links in `CHANGELOG` and `CONTRIBUTING` #1032
- Improve the docstring for rank drivers #1143
Others
- Fix duplicate lines in cookiecutter #1063
- Fix conflicts between copyright adding action and typing #1023
- Move `numpy` importing inside function #1019
- Rename `jina_cli` to `cli` #1017
- Fix typing error in `mypy` #1009
- Fix line spaces in code #1105
🙏 Thanks to our Contributors
This release contains contributions from Alex C-G, Alex McKenzie, CatStark, Christopher Lennan, Deepankar Mahapatro, Fernanda Kawasaki, Han Xiao, Joan Fontanals Martinez, Ján Jendrušák, Maximilian Werk, Nan Wang, Oleh Yaroshchuk, Pratik Bhavsar, RenrakuRunrat, Rutuja Surve, Sai Sandeep Mutyala, Sergei Averkiev, Susana Guzman, Wang Bo, jancijen, pswu11
🙏 Thanks to our Community
And thanks to all of you out there as well! Without you, Jina couldn't do what we do. Your support means a lot to us.
🤝 Work with Jina
Want to work with Jina full-time? Check out our openings on our website.
🎉 v0.6.0
Jina v0.6.0
We are excited to release Jina 0.6.0. Jina is the easier way to do neural search on the cloud. Highlights of this release include:
- Improve the memory footprint for the `Indexer`.
- Add an example for building a cross-modal search system with Jina.
- Add support for indexing `.pdf` files.
Release 0.6.0
⬆️ Major Features and Improvements
Scalability
- Improve the memory footprint for the `Indexer`. Instead of using an in-memory index in query mode, both the `NumpyIndexer` and the `BinaryPbIndexer` use memory mapping to better support scaling out to large datasets. To further improve the memory footprint of the vector index, `ZarrIndexer`, based on Zarr, has been added to Jina Hub. #950, #984
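As a rough illustration of why memory mapping helps here, the sketch below stores fixed-size vectors in a binary file and reads a single row through a memory map, so the whole file never has to be loaded at once. It is not Jina's implementation: it uses Python's standard `mmap` and `struct` modules, and the names `DIM`, `write_vectors`, and `read_vector` are hypothetical.

```python
import mmap
import os
import struct
import tempfile

DIM = 4  # hypothetical embedding dimension (assumption for this sketch)
ROW = struct.Struct(f'<{DIM}f')  # one little-endian float32 vector per row


def write_vectors(path, vectors):
    """Append-style index build: write each vector as a fixed-size row."""
    with open(path, 'wb') as f:
        for vec in vectors:
            f.write(ROW.pack(*vec))


def read_vector(path, idx):
    """Query-time lookup: map the file and decode only the requested row."""
    with open(path, 'rb') as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mm:
            return ROW.unpack_from(mm, idx * ROW.size)


path = os.path.join(tempfile.mkdtemp(), 'vectors.bin')
write_vectors(path, [[0.0, 1.0, 2.0, 3.0], [4.0, 5.0, 6.0, 7.0]])
second = read_vector(path, 1)
```

Because the operating system pages in only the parts of the file that are touched, the resident memory stays small even when the index file is much larger than RAM.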
Universal
- Add an example for building a cross-modal search system with Jina. #978
- Add support for indexing `.pdf` files. `PdfExtractor` has been added to Jina Hub. #981
⚠️ Breaking Changes
For details of all breaking changes, please refer to #885
- Improve the way of traversing recursive document structure. #944, #933, #923, #893, #889,
- Rename `--yaml-path` to `--uses` in Flow CLI #925, #922
- Rename `--uses-reducing` to `--uses-after` and add `--uses-before`. This change lets us customize executor behavior before requests are sent to, and after results are received from, all parallels/shards. #925
🐞 Bug Fixes and Other Changes
Flow
- Improve context management of Flow and Pod with ExitStack. #901,
- Improve shut-down logic for log server #935, #958
- Fix shut-down logic for Peas and Pods #907, #956
- Refactor de-/serialization logic #988, #991
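For readers unfamiliar with `contextlib.ExitStack`, the following minimal sketch shows the general pattern of a parent object that starts several child contexts and closes them in reverse order. `ToyPod` and `ToyFlow` are hypothetical stand-ins, not Jina's actual classes, and the real Flow/Pod code may differ.

```python
from contextlib import ExitStack

events = []  # records the order in which the toy pods start and close


class ToyPod:
    """Hypothetical stand-in for a Pod; not Jina's actual class."""

    def __init__(self, name):
        self.name = name

    def __enter__(self):
        events.append(f'start {self.name}')
        return self

    def __exit__(self, exc_type, exc, tb):
        events.append(f'close {self.name}')
        return False  # do not swallow exceptions


class ToyFlow:
    """Hypothetical stand-in for a Flow that owns several pods."""

    def __init__(self, *pod_names):
        self.pod_names = pod_names
        self._stack = None

    def __enter__(self):
        with ExitStack() as stack:
            for name in self.pod_names:
                stack.enter_context(ToyPod(name))
            # All pods started: transfer ownership so they stay open.
            self._stack = stack.pop_all()
        return self

    def __exit__(self, exc_type, exc, tb):
        self._stack.close()  # closes pods in reverse start order
        return False


with ToyFlow('encoder', 'indexer'):
    events.append('running')
```

If one pod fails to start, the inner `with ExitStack()` block closes the pods that were already started, which is the main benefit of this pattern over manual bookkeeping.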
Executors
- Add a meta variable `force_register` for executors in order to force Jina to use the local version of the executor. #883
- Fix a bug in reducing functions for encoders. #900
- Fix default behavior of `CompoundIndexer` #939
- Fix bug in overwriting metas using Python client. #980
Drivers
- Add `CollectMatches2DocRankDriver` for calculating `matches` with `granularity=k-1` from `Matches` at `granularity=k`. #851
- Add `Matches2DocRankDriver` for calculating new scores of `matches` from original scores #919
- Add `VectorFillDriver` for filling embeddings of Document 2 #909, #913
- Add support for using `tags` with `QueryLangDrivers` #938
- Add support for traversing recursive Documents via explicit tree path definition. #983, #979, #994, #993
- Enable `BaseSegmenter` to change `mime_type`. #981
- Add `NdArray2PngURI` and `Blob2PngURI` for converting numpy arrays into data URIs. #982
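To make the idea of a PNG data URI concrete, here is a minimal, standard-library-only sketch that encodes a small grayscale image and wraps it as a `data:image/png;base64,...` string. It is not the code of the two converters named above: it takes a nested list of 0-255 integers rather than a numpy array, and `gray_to_png_uri` and `_chunk` are hypothetical names.

```python
import base64
import struct
import zlib

PNG_SIGNATURE = b'\x89PNG\r\n\x1a\n'


def _chunk(tag: bytes, data: bytes) -> bytes:
    """Build one PNG chunk: length, type, data, CRC over type + data."""
    crc = zlib.crc32(tag + data) & 0xFFFFFFFF
    return struct.pack('>I', len(data)) + tag + data + struct.pack('>I', crc)


def gray_to_png_uri(rows) -> str:
    """Encode a 2-D list of 0-255 ints as an 8-bit grayscale PNG data URI."""
    height, width = len(rows), len(rows[0])
    # IHDR: width, height, bit depth 8, color type 0 (grayscale),
    # default compression, default filter, no interlacing.
    ihdr = struct.pack('>IIBBBBB', width, height, 8, 0, 0, 0, 0)
    # Each scanline is prefixed with filter type 0 (no filtering).
    raw = b''.join(b'\x00' + bytes(row) for row in rows)
    png = (
        PNG_SIGNATURE
        + _chunk(b'IHDR', ihdr)
        + _chunk(b'IDAT', zlib.compress(raw))
        + _chunk(b'IEND', b'')
    )
    return 'data:image/png;base64,' + base64.b64encode(png).decode('ascii')


uri = gray_to_png_uri([[0, 255], [255, 0]])  # 2x2 checkerboard
```

A string like `uri` can be placed directly in an HTML `<img src=...>` attribute, which is what makes data URIs convenient for showing results in a browser.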
CLI
- Add `--test-uses` option for `jina hub build` CLI for skipping failed-start peas when building Docker file. #902, #965
- Add `is_build_success` field for checking results of `jina hub build`. #903
- Add `--type app` option for `jina hub new` CLI for creating a new Jina app. #917
- Add `--push` option for `jina hub build` CLI for building and pushing local executors to Jina Hub. #937
- Improve `jina hub list` CLI. #985
- Improve speed of CLI autocompletion. #992
Tests
- Add more unit tests for reducing functions 1 #898
- Move dependencies for unit tests into `extra-requirements.txt` #906
- Add unit tests for sleeping executors #918
- Add more unit tests for checking Peas #921
- Add more unit tests for decorators of executors. #930
- Add more unit tests for overriding Flow arguments. #926, #927
- Fix name conflicts in tests when running unit tests on GitHub. #961
- Add more unit tests for support of Documents with chunks of different `mime_type`. #968
Documentation
- Improve documentation for drivers #886, #888, #990
- Improve README #894
- Fix typos in documentation. #904, #912, #940, #978
Others
- Improve helper functions. #948, #972, #974
- Improve type annotations. #962, #966, #967
- Improve module importing logic for classes from `jina-hub`. #967
- Improve authentication for `jina hub` #977
- Jina ❤️ Hacktoberfest. #986
🙏 Thanks to our Contributors
This release contains contributions from Alasdair Tran, Alex C-G, David Sanwald, Deepankar Mahapatro, Han Xiao, JamesTang-jinaai, Joan Fontanals Martinez, Maximilian Werk, Nan Wang, Rutuja Surve, Sreerag-ibtl, Susana Guzman, Yue Liu, pswu11, rameshwara
🙏 Thanks to our Community
And thanks to all of you out there as well! Without you, Jina couldn't do what we do. Your support means a lot to us.
🤝 Work with Jina
Want to work with Jina full-time? Check out our openings on our website.