Started introducing cloudflow-contrib thoughts into cloudflow docs #1046
| In the current version, Cloudflow includes backend `Streamlet` implementations for Akka, Apache Spark - Structured Streaming, and Apache Flink. | ||
| Using these implementations you can write business logic in the native API of the backend. | ||
| Additionally, Cloudflow can be extended with new streaming backends. | ||
| Native support for Flink and Spark streamlets are supported as _legacy_ versions and will be discontinued in future. The current version introduces Flink and Spark integrations with more controls in the hands of the users. Users now have more control on deployment and management of Flink and Spark streamlets while still using the same Cloudflow streamlet API for developing their business logic. However, Akka will continue to be supported natively as in earlier versions. |
I think that we should not yet announce the current integration as "legacy"
I agree. Also, "native support for" and "Flink native" seem to have been mixed up. We're not going to discontinue the native integration (through the CLI's native Kubernetes features). We have to come up with a better name for the current integration style.
Maybe "supported by the Cloudflow operator" vs. "via the CLI"?
OK, I will make it sound like both strategies are available.
| Using these implementations you can write business logic in the native API of the backend. | ||
| Additionally, Cloudflow can be extended with new streaming backends. | ||
| Native support for Flink and Spark streamlets are supported as _legacy_ versions and will be discontinued in future. The current version introduces Flink and Spark integrations with more controls in the hands of the users. Users now have more control on deployment and management of Flink and Spark streamlets while still using the same Cloudflow streamlet API for developing their business logic. However, Akka will continue to be supported natively as in earlier versions. | ||
| Integration support for Flink and Spark, thus being externalized, makes Cloudflow easily extensible with new streaming backends. |
The new integration should be incentivized but marked "experimental" for now.
Mentioned that.
| This method is not only dev-friendly, but is also compatible with the typical CI/CD deployments. | ||
| This allows you to take the application from dev to production in a controlled way. | ||
| The deployment procedure will be a bit different with the _cloudflow contrib_ approach where Flink and Spark applications are supported through external plugins. Akka applications will be fully depoyed using `kubectl cloudflow` as above. However for Spark and Flink applications, you need to use an extra plugin and carry out a few extra steps to make them known to the cloudflow engine. For details please have a look at xref:develop:cloudflow-contrib-change-me.adoc[Cloudflow Contrib] documentation. |
| The deployment procedure will be a bit different with the _cloudflow contrib_ approach where Flink and Spark applications are supported through external plugins. Akka applications will be fully depoyed using `kubectl cloudflow` as above. However for Spark and Flink applications, you need to use an extra plugin and carry out a few extra steps to make them known to the cloudflow engine. For details please have a look at xref:develop:cloudflow-contrib-change-me.adoc[Cloudflow Contrib] documentation. | |
| The deployment procedure will be a bit different with the _cloudflow contrib_ approach where Flink and Spark applications are supported through external plugins. Akka applications will be fully deployed using `kubectl cloudflow` as above. However for Spark and Flink applications, you need to use an extra plugin and carry out a few extra steps to make them known to the Cloudflow engine. For details please have a look at xref:develop:cloudflow-contrib-change-me.adoc[Cloudflow Contrib] documentation. |
@andreaTP, @RayRoestenburg Do we need to make any more changes for the version of Cloudflow where we offer both options: the current operator-based management for Spark and Flink and the new implementation through
| In the current version, Cloudflow includes backend `Streamlet` implementations for Akka, Apache Spark - Structured Streaming, and Apache Flink. | ||
| Using these implementations you can write business logic in the native API of the backend. | ||
| Additionally, Cloudflow can be extended with new streaming backends. | ||
| Along with the currently supported built-in integration of Flink and Spark streamlets via the Cloudflow operator, Cloudflow also supports external integrations for these streaming platforms through additional plugins. The current version introduces Flink and Spark integrations with more controls in the hands of the users. Users now have more control on deployment and management of Flink and Spark streamlets while still using the same Cloudflow streamlet API for developing their business logic. However, Akka will continue to be supported natively as in earlier versions. |
You can remove:
The current version introduces Flink and Spark integrations with more controls in the hands of the users.
And start the following sentence with something like:
Using this new integration ...
| Additionally, Cloudflow can be extended with new streaming backends. | ||
| Along with the currently supported built-in integration of Flink and Spark streamlets via the Cloudflow operator, Cloudflow also supports external integrations for these streaming platforms through additional plugins. The current version introduces Flink and Spark integrations with more controls in the hands of the users. Users now have more control on deployment and management of Flink and Spark streamlets while still using the same Cloudflow streamlet API for developing their business logic. However, Akka will continue to be supported natively as in earlier versions. | ||
| Integration support for Flink and Spark, thus being externalized, makes Cloudflow easily extensible with new streaming backends. The new externalized integration has been marked _Experimental_ in the current version. | ||
| For more details on externalized Flink and Spark integrations, please have a look at https://lightbend.github.io/cloudflow-contrib/docs/0.0.4/index.html[Cloudflow Contrib] documentation. |
Use this as the link:
https://lightbend.github.io/cloudflow-contrib/
The redirect to the current version is performed there.
| This method is not only dev-friendly, but is also compatible with the typical CI/CD deployments. | ||
| This allows you to take the application from dev to production in a controlled way. | ||
| The deployment procedure will be a bit different with the _cloudflow contrib_ approach where Flink and Spark applications are supported through external plugins. Akka applications will be fully deployed using `kubectl cloudflow` as above. However for Spark and Flink applications, you need to use an extra plugin and carry out a few extra steps to make them known to the Cloudflow engine. For details please have a look at https://lightbend.github.io/cloudflow-contrib/docs/0.0.4/index.html[Cloudflow Contrib] documentation. |
| The deployment procedure will be a bit different with the _cloudflow contrib_ approach where Flink and Spark applications are supported through external plugins. Akka applications will be fully deployed using `kubectl cloudflow` as above. However for Spark and Flink applications, you need to use an extra plugin and carry out a few extra steps to make them known to the Cloudflow engine. For details please have a look at https://lightbend.github.io/cloudflow-contrib/docs/0.0.4/index.html[Cloudflow Contrib] documentation. | |
| The deployment procedure will be different with the _cloudflow contrib_ approach where Flink and Spark applications are supported through external plugins. Akka applications will be fully deployed using `kubectl cloudflow` as above. However, for Spark and Flink applications, you need to use an extra plugin and carry out a few extra steps to make them known to the Cloudflow engine. For details please have a look at https://lightbend.github.io/cloudflow-contrib[Cloudflow Contrib] documentation. |
| - The Flink Job Manager then requests task manager resources from Kubernetes to deploy the distributed processing. | ||
| - Finally, if and when resources are available, the Flink-bound task managers start as Kubernetes pods. The task managers are the components tasked with the actual data processing, while the Job Manager serves as coordinator of the (stream) data process. | ||
| In case you are using the cloudflow-contrib model of integration, you need to go through some additional steps to complete the deployment of your Flink streamlets. This https://lightbend.github.io/cloudflow-contrib/docs/0.0.4/get-started/flink-native.html[section] on cloudflow-contrib has more details. |
Just thought of adding this here since there is no mention of cloudflow-contrib in this section on Flink streamlets.
| In this architecture, the Spark driver runs the Cloudflow-specific logic that connects the streamlet to our managed data streams, at which point the streamlet starts consuming from inlets. | ||
| The streamlet advances through the data streams that are provided on inlets and writes data to outlets. | ||
| In case you are using the cloudflow-contrib model of integration, you need to go through some additional steps to complete the deployment of your Spark streamlets. This https://lightbend.github.io/cloudflow-contrib/docs/0.0.4/get-started/spark-native.html[section] on cloudflow-contrib has more details. |
Just thought of adding this here since there is no mention of cloudflow-contrib in this section on Spark streamlets.
What changes were proposed in this pull request?
Introducing cloudflow-contrib thoughts into the Cloudflow documentation for the version where both options co-exist.
Why are the changes needed?
The changes are required so that users now have the option to use the externalized integration of Flink and Spark as implemented in cloudflow-contrib.
Does this PR introduce any user-facing change?
Yes, cloudflow-contrib is also an option now for users to use Flink and Spark.
How was this patch tested?
No testing, just documentation changes.