Started introducing cloudflow-contrib thoughts into cloudflow docs by debasishg · Pull Request #1046 · lightbend/cloudflow

debasishg · 2021-05-07T05:33:16Z

What changes were proposed in this pull request?

Introducing cloudflow-contrib thoughts into cloudflow documentation for the version where both options co-exist

Why are the changes needed?

These changes are needed so that users have the option to use the externalized Flink and Spark integration implemented in cloudflow-contrib.

Does this PR introduce any user-facing change?

Yes, cloudflow-contrib is also an option now for users to use Flink and Spark

How was this patch tested?

No testing, just documentation changes

andreaTP · 2021-05-07T09:32:12Z

 In the current version, Cloudflow includes backend `Streamlet` implementations for Akka, Apache Spark - Structured Streaming, and Apache Flink. 
 Using these implementations you can write business logic in the native API of the backend.
-Additionally, Cloudflow can be extended with new streaming backends.
+Native support for Flink and Spark streamlets are supported as _legacy_ versions and will be discontinued in future. The current version introduces Flink and Spark integrations with more controls in the hands of the users. Users now have more control on deployment and management of Flink and Spark streamlets while still using the same Cloudflow streamlet API for developing their business logic. However, Akka will continue to be supported natively as in earlier versions.


I think that we should not yet announce the current integration as "legacy"

I agree. Also, "native support for" and "Flink native" seem to have been mixed up. We're not going to discontinue the native integration (through the CLI's native Kubernetes features). We have to come up with a better name for the current integration style.

Maybe "supported by the Cloudflow operator" vs. "via the CLI"?

OK, I will word it so that both strategies are presented as available.

andreaTP · 2021-05-07T09:32:42Z

 Using these implementations you can write business logic in the native API of the backend.
-Additionally, Cloudflow can be extended with new streaming backends.
+Native support for Flink and Spark streamlets are supported as _legacy_ versions and will be discontinued in future. The current version introduces Flink and Spark integrations with more controls in the hands of the users. Users now have more control on deployment and management of Flink and Spark streamlets while still using the same Cloudflow streamlet API for developing their business logic. However, Akka will continue to be supported natively as in earlier versions.
+Integration support for Flink and Spark, thus being externalized, makes Cloudflow easily extensible with new streaming backends.


The new integration should be incentivized but marked "experimental" for now

Mentioned that.

RayRoestenburg · 2021-05-07T10:25:30Z


 This method is not only dev-friendly, but is also compatible with the typical CI/CD deployments. 
 This allows you to take the application from dev to production in a controlled way.
+The deployment procedure will be a bit different with the _cloudflow contrib_ approach where Flink and Spark applications are supported through external plugins. Akka applications will be fully depoyed using `kubectl cloudflow` as above. However for Spark and Flink applications, you need to use an extra plugin and carry out a few extra steps to make them known to the cloudflow engine. For details please have a look at xref:develop:cloudflow-contrib-change-me.adoc[Cloudflow Contrib] documentation.  


Suggested change

The deployment procedure will be a bit different with the _cloudflow contrib_ approach where Flink and Spark applications are supported through external plugins. Akka applications will be fully depoyed using `kubectl cloudflow` as above. However for Spark and Flink applications, you need to use an extra plugin and carry out a few extra steps to make them known to the cloudflow engine. For details please have a look at xref:develop:cloudflow-contrib-change-me.adoc[Cloudflow Contrib] documentation.

The deployment procedure will be a bit different with the _cloudflow contrib_ approach where Flink and Spark applications are supported through external plugins. Akka applications will be fully deployed using `kubectl cloudflow` as above. However for Spark and Flink applications, you need to use an extra plugin and carry out a few extra steps to make them known to the Cloudflow engine. For details please have a look at xref:develop:cloudflow-contrib-change-me.adoc[Cloudflow Contrib] documentation.

debasishg · 2021-05-11T05:24:40Z

@andreaTP, @RayRoestenburg Do we need to make any more changes for the version of Cloudflow where we offer both options: the current operator-based management for Spark and Flink, and the new implementation through cloudflow-contrib?

andreaTP

a few more suggestions

andreaTP · 2021-05-14T15:29:40Z

 In the current version, Cloudflow includes backend `Streamlet` implementations for Akka, Apache Spark - Structured Streaming, and Apache Flink. 
 Using these implementations you can write business logic in the native API of the backend.
-Additionally, Cloudflow can be extended with new streaming backends.
+Along with the currently supported built-in integration of Flink and Spark streamlets via the Cloudflow operator, Cloudflow also supports external integrations for these streaming platforms through additional plugins. The current version introduces Flink and Spark integrations with more controls in the hands of the users. Users now have more control on deployment and management of Flink and Spark streamlets while still using the same Cloudflow streamlet API for developing their business logic. However, Akka will continue to be supported natively as in earlier versions.


You can remove:

The current version introduces Flink and Spark integrations with more controls in the hands of the users.

And start the following sentence with something like:

Using this new integration ...

andreaTP · 2021-05-14T15:30:49Z

-Additionally, Cloudflow can be extended with new streaming backends.
+Along with the currently supported built-in integration of Flink and Spark streamlets via the Cloudflow operator, Cloudflow also supports external integrations for these streaming platforms through additional plugins. The current version introduces Flink and Spark integrations with more controls in the hands of the users. Users now have more control on deployment and management of Flink and Spark streamlets while still using the same Cloudflow streamlet API for developing their business logic. However, Akka will continue to be supported natively as in earlier versions.
+Integration support for Flink and Spark, thus being externalized, makes Cloudflow easily extensible with new streaming backends. The new externalized integration has been marked _Experimental_ in the current version.
+For more details on externalized Flink and Spark integrations, please have a look at https://lightbend.github.io/cloudflow-contrib/docs/0.0.4/index.html[Cloudflow Contrib] documentation.


Use this as the link:

https://lightbend.github.io/cloudflow-contrib/

The redirect is performed there.

andreaTP · 2021-05-14T15:31:53Z


 This method is not only dev-friendly, but is also compatible with the typical CI/CD deployments. 
 This allows you to take the application from dev to production in a controlled way.
+The deployment procedure will be a bit different with the _cloudflow contrib_ approach where Flink and Spark applications are supported through external plugins. Akka applications will be fully deployed using `kubectl cloudflow` as above. However for Spark and Flink applications, you need to use an extra plugin and carry out a few extra steps to make them known to the Cloudflow engine. For details please have a look at https://lightbend.github.io/cloudflow-contrib/docs/0.0.4/index.html[Cloudflow Contrib] documentation.  


Suggested change

The deployment procedure will be a bit different with the _cloudflow contrib_ approach where Flink and Spark applications are supported through external plugins. Akka applications will be fully deployed using `kubectl cloudflow` as above. However for Spark and Flink applications, you need to use an extra plugin and carry out a few extra steps to make them known to the Cloudflow engine. For details please have a look at https://lightbend.github.io/cloudflow-contrib/docs/0.0.4/index.html[Cloudflow Contrib] documentation.

The deployment procedure will be different with the _cloudflow contrib_ approach where Flink and Spark applications are supported through external plugins. Akka applications will be fully deployed using `kubectl cloudflow` as above. However, for Spark and Flink applications, you need to use an extra plugin and carry out a few extra steps to make them known to the Cloudflow engine. For details please have a look at https://lightbend.github.io/cloudflow-contrib[Cloudflow Contrib] documentation.

andreaTP · 2021-05-14T15:32:30Z

 - The Flink Job Manager then requests task manager resources from Kubernetes to deploy the distributed processing.
 - Finally, if and when resources are available, the Flink-bound task managers start as Kubernetes pods. The task managers are the components tasked with the actual data processing, while the Job Manager serves as coordinator of the (stream) data process.

+In case you are using the cloudflow-contrib model of integration, you need to go through some additional steps to complete the deployment of your Flink streamlets. This https://lightbend.github.io/cloudflow-contrib/docs/0.0.4/get-started/flink-native.html[section] on cloudflow-contrib has more details.


Do we need this at all?

I just thought of adding this here since there is no mention of cloudflow-contrib in this section on Flink streamlets.

andreaTP · 2021-05-14T15:32:40Z

 In this architecture, the Spark driver runs the Cloudflow-specific logic that connects the streamlet to our managed data streams, at which point the streamlet starts consuming from inlets.
 The streamlet advances through the data streams that are provided on inlets and writes data to outlets.

+In case you are using the cloudflow-contrib model of integration, you need to go through some additional steps to complete the deployment of your Spark streamlets. This https://lightbend.github.io/cloudflow-contrib/docs/0.0.4/get-started/spark-native.html[section] on cloudflow-contrib has more details.


Do we need this comment?

I just thought of adding this here since there is no mention of cloudflow-contrib in this section on Spark streamlets.

Started introducing cloudflow-contrib thoughts into cloudflow docs

c999559

debasishg requested review from RayRoestenburg and andreaTP May 7, 2021 05:33

Trying to fix error in dcos build

683ff73

andreaTP suggested changes May 7, 2021

RayRoestenburg reviewed May 7, 2021

debasishg added 2 commits May 10, 2021 11:16

Incorporated review feedback changes

fd019d5

cloudflow-contrib references added to Spark and Flink support sections

d493515

debasishg marked this pull request as ready for review May 11, 2021 05:23

andreaTP reviewed May 14, 2021

REview feedbacks

9322079

andreaTP approved these changes May 21, 2021

debasishg merged commit 3968724 into master May 24, 2021

debasishg deleted the docs-with-contrib branch May 24, 2021 04:49
